CN111797756A - Video analysis method, device and medium based on artificial intelligence - Google Patents
- Publication number
- CN111797756A (application CN202010622578.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- video segment
- dish
- artificial intelligence
- score
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V40/172—Classification, e.g. identification (human faces)
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods (neural networks)
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
- G06Q50/12—Hotels or restaurants
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V40/168—Feature extraction; Face representation
- G06V40/174—Facial expression recognition
Abstract
The present disclosure relates to the fields of artificial intelligence and data processing, and provides a video analysis method, device and medium based on artificial intelligence. The method splits an original video into video segments so that each dining occasion can be judged in a targeted manner, performs feature extraction on each video segment with a plurality of pre-trained models so that features are extracted automatically by artificial-intelligence means, analyzes the features to obtain analysis data, connects a designated sensor and retrieves the target data of each video segment collected by that sensor, and calculates a video analysis result from the analysis data and the target data. Videos can thus be analyzed comprehensively and in a targeted manner based on artificial intelligence, with higher analysis efficiency. The invention also relates to blockchain technology: the plurality of pre-trained models are stored on a blockchain.
Description
Technical Field
The invention relates to the technical field of artificial intelligence and data processing, in particular to a video analysis method, equipment and medium based on artificial intelligence.
Background
With the improvement of living standards, more and more people eat out, and a good chef can bring considerable economic benefits and a good social image to a restaurant.
Customer satisfaction with a chef's dishes is the most direct and fundamental criterion for assessing the chef's skill, but in practice customers' feedback on dishes is difficult to collect and inconvenient to handle. Current assessment methods mainly evaluate each chef through daily attendance and the rigid regulations involved in food-safety production; some also consider the customer complaint rate, but none addresses customer satisfaction with each dish, so assessment of a chef's core competitiveness is lacking.
In addition, the surveillance videos of restaurants are currently either analyzed manually or used only for face recognition, so the videos are under-utilized; the result of any single analysis mode is not comprehensive enough, and analyzing the collected videos as a whole is inefficient and unfocused.
Disclosure of Invention
In view of the foregoing, there is a need for a video analysis method, device and medium based on artificial intelligence that can analyze videos comprehensively and in a targeted manner by artificial-intelligence means, with higher analysis efficiency.
An artificial intelligence based video analysis method, comprising:
when a video analysis request is received, acquiring an original video collected by an acquisition device;
splitting the original video to obtain at least one video segment;
extracting the characteristics of each video segment by using a plurality of pre-trained models, and outputting at least one characteristic of each video segment;
analyzing at least one characteristic of each video segment to obtain analysis data;
connecting a designated sensor, and calling target data of each video segment acquired by the designated sensor;
and calculating according to the analysis data and the target data to obtain a video analysis result.
According to a preferred embodiment of the present invention, the splitting the original video to obtain at least one video segment includes:
splitting the original video according to the dining tables to obtain at least one section of table video including each dining table;
calling at least one piece of dining information included in each table video in the at least one table video from a specified platform;
acquiring the starting time and the ending time of each dining from the at least one piece of dining information;
splitting each section of table video according to the starting time and the ending time to obtain a split video of each section of table video;
and integrating the split video of each section of table video to obtain the at least one video section.
According to a preferred embodiment of the present invention, the pre-trained models are stored in a blockchain, the extracting features of each video segment using the pre-trained models, and the outputting at least one feature of each video segment includes:
adopting a first target detection model to identify the facial features of each video segment, inputting the identified facial features into an expression classification model, and outputting the expression classes corresponding to the facial features in each video segment, wherein the expression classes comprise satisfied expressions and unsatisfied expressions, and the expression classification model is obtained by training on labeled face pictures from at least one angle;
identifying the dinner plate of each video segment by adopting a second target detection model, determining the coordinates of each dinner plate, and determining the dinner plate in the specified coordinate range as the dinner plate in each video segment;
inputting each video segment into a dish classification model, and outputting each dish corresponding to each dinner plate, wherein the dish classification model is obtained by training a preset dish picture;
and inputting each video segment into a behavior detection model, and outputting the behavior characteristics of each customer in each video segment, wherein the behavior characteristics comprise nodding and head shaking, and the behavior detection model is obtained by training with OpenPose and LSTM.
According to a preferred embodiment of the present invention, the first target detection model is a pre-trained face detector, the expression classification model is a pre-trained random fern classifier, and the recognizing the facial features of each video segment by using the first target detection model, inputting the recognized facial features into the expression classification model, and outputting the expression class corresponding to each facial feature in each video segment includes:
detecting the picture of each video segment by using the face detector to obtain the detection result of each picture;
when the detection result shows that the picture is a face picture, extracting the difference characteristics of the face picture;
and inputting the extracted differential features into the random fern classifier, and outputting the expression category corresponding to each face picture in each video segment.
According to a preferred embodiment of the present invention, the analysis data includes a dinner plate amount in each video segment, a number of times of nodding of a customer, a number of times of shaking of a customer, a number of times of occurrence of satisfied expressions of a customer, and a number of times of occurrence of unsatisfied expressions of a customer, and the method for video analysis based on artificial intelligence further includes:
calculating a first difference value between the number of times of head nodding of the customer and the number of times of head shaking of the customer in each video segment;
calculating a second difference value between the occurrence times of the satisfied expressions of the customer and the occurrence times of the dissatisfied expressions of the customer in each video segment;
acquiring the larger value of the first difference value and the second difference value as the comparison value of each video segment;
when the comparison value of the video segment is greater than or equal to the dinner plate amount, outputting a first score as the overall score of the at least one chef corresponding to the video segment; or
when the comparison value of the video segment is less than or equal to the negative of the dinner plate amount, outputting a second score as the overall score of the at least one chef corresponding to the video segment;
and when the comparison value of the video segment is greater than the negative of the dinner plate amount and less than the dinner plate amount, calculating the quotient of the comparison value and the dinner plate amount as the overall score of the at least one chef corresponding to the video segment.
According to a preferred embodiment of the present invention, the target data includes an initial weight and an end weight of each dish corresponding to each dinner plate in each video segment, and the artificial intelligence based video analysis method further includes:
calculating the difference between the initial weight and the end weight to obtain a reduction amount;
calculating the ratio of the reduction amount to the initial weight;
and normalizing the ratio to obtain the dish score of each dish.
According to the preferred embodiment of the present invention, the artificial intelligence based video analysis method further comprises:
calculating the sub-score of each chef corresponding to each dish in each video segment according to the overall score of at least one chef corresponding to each video segment and the dish score of each dish by adopting the following formula:
Y=Y2+Y2/(2^Y1)
wherein Y is the sub-score of each chef in each video segment for each corresponding dish, Y1 is the overall score of at least one chef corresponding to each video segment, and Y2 is the dish score of each dish.
According to a preferred embodiment of the present invention, the calculating according to the analysis data and the target data to obtain a video analysis result includes:
determining the occurrence frequency of each dish in each video segment;
calculating the accumulated sum of the sub-scores of each chef for each corresponding dish in each video segment;
for each dish, the quotient of the corresponding cumulative sum and the number of occurrences is calculated as the final score for each chef for the corresponding each dish.
An artificial intelligence based video analytics device, comprising:
the acquisition unit is used for acquiring the original video collected by the acquisition device when a video analysis request is received;
the splitting unit is used for splitting the original video to obtain at least one video segment;
the extraction unit is used for extracting the characteristics of each video segment by using a plurality of pre-trained models and outputting at least one characteristic of each video segment;
the analysis unit is used for analyzing at least one characteristic of each video segment to obtain analysis data;
the calling unit is used for connecting a designated sensor and calling the target data of each video segment acquired by the designated sensor;
and the calculating unit is used for calculating according to the analysis data and the target data to obtain a video analysis result.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the artificial intelligence based video analytics method.
A computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executable by a processor in an electronic device to implement the artificial intelligence based video analytics method.
According to the technical scheme, when a video analysis request is received, the original video collected by the acquisition device is acquired and split into at least one video segment, so that the dining situation of each table can subsequently be judged in a targeted manner; feature extraction is performed on each video segment with a plurality of pre-trained models, and at least one feature of each video segment is output, so that features are extracted automatically by artificial-intelligence means; the at least one feature of each video segment is analyzed to obtain analysis data; a designated sensor is connected, and the target data of each video segment collected by that sensor is retrieved; and a video analysis result is calculated from the analysis data and the target data. The video can thus be analyzed comprehensively and in a targeted manner based on artificial intelligence, with higher analysis efficiency.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the artificial intelligence based video analysis method of the present invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the artificial intelligence based video analysis apparatus of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device implementing a video analysis method based on artificial intelligence according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the artificial intelligence based video analysis method according to the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The video analysis method based on artificial intelligence is applied to one or more electronic devices, which are devices capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; their hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud computing (cloud computing) based cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
And S10, when a video analysis request is received, acquiring the original video collected by the acquisition device.
In this embodiment, the original video may include video shot during an assessment period.
The acquisition device can comprise at least one camera device, the at least one camera device is used for recording videos in a dining area in real time, and each camera device is installed at a fixed position and used for shooting dining videos of customers in a dining room.
In this embodiment, the video analysis may be triggered by the relevant staff, for example: restaurant managers, restaurant assessment personnel, and the like.
S11, splitting the original video to obtain at least one video segment.
In at least one embodiment of the invention, in order to facilitate the analysis of dining conditions of each table, the electronic device splits the original video to obtain at least one video segment.
Specifically, the splitting the original video to obtain at least one video segment includes:
splitting the original video according to the dining tables to obtain at least one section of table video including each dining table;
calling at least one piece of dining information included in each table video in the at least one table video from a specified platform;
acquiring the starting time and the ending time of each dining from the at least one piece of dining information;
splitting each section of table video according to the starting time and the ending time to obtain a split video of each section of table video;
and integrating the split video of each section of table video to obtain the at least one video section.
Through the embodiment, at least one video segment obtained after splitting corresponds to the complete video of each dining table in the one-time dining process, so that dishes of each chef can be objectively evaluated in a subsequent targeted manner according to the dining condition of each table.
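For illustration only, this splitting step might be sketched in Python as follows; the per-table crop regions, the (start, end) timestamps retrieved from the specified platform, and the use of OpenCV are assumptions made for the sketch, not details fixed by this embodiment.

```python
# Sketch of the splitting step, assuming a fixed camera whose frames can be
# cropped to per-table regions and a platform API that yields one
# (start_time, end_time) pair per meal. All names are hypothetical.
import cv2

def split_by_table(video_path, table_regions):
    """Crop the original video into one frame stream per dining table."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    streams = {table: [] for table in table_regions}
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for table, (x, y, w, h) in table_regions.items():
            streams[table].append(frame[y:y + h, x:x + w])
    cap.release()
    return streams, fps

def split_by_meal(frames, fps, dining_info):
    """Cut one table's frame stream into one video segment per meal."""
    return [frames[int(start * fps):int(end * fps)]
            for start, end in dining_info]  # times in seconds from video start
```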
And S12, extracting the features of each video segment by using a plurality of models trained in advance, and outputting at least one feature of each video segment.
In this embodiment, the at least one feature may include, but is not limited to: expression categories, dinner plates, dishes, behavior characteristics, and the like.
Specifically, the pre-trained models are stored in a blockchain, the extracting features of each video segment by using the pre-trained models, and outputting at least one feature of each video segment includes:
(1) adopting a first target detection model to identify the facial features of each video segment, inputting the identified facial features into an expression classification model, and outputting the expression classes corresponding to the facial features in each video segment, wherein the expression classes comprise satisfied expressions and unsatisfied expressions, and the expression classification model is obtained by training on labeled face pictures from at least one angle;
The first target detection model is trained on labeled face pictures from at least one angle, so faces at various angles can be recognized and the model is more accurate.
The first target detection model is a pre-trained face detector, and the expression classification model is a pre-trained random fern classifier.
Specifically, the recognizing the facial features of each video segment by using the first target detection model, inputting the recognized facial features into the expression classification model, and outputting the expression category corresponding to each facial feature in each video segment includes:
detecting the picture of each video segment by using the face detector to obtain the detection result of each picture;
when the detection result shows that the picture is a face picture, extracting the difference characteristics of the face picture;
and inputting the extracted differential features into the random fern classifier, and outputting the expression category corresponding to each face picture in each video segment.
It should be noted that differential features are used because they are simple to compute, statistically meaningful, and fast to evaluate, which improves the classification speed and efficiency of the random fern classifier.
In addition, this scheme combines the face detector and the random fern classifier: the face detector first performs a preliminary screening for facial features to obtain candidate face pictures, and the random fern classifier then performs a secondary screening on those candidates. That is, the random fern classifier classifies the detection results of the face detector, which ensures the accuracy of the classification results and effectively reduces the false detection rate.
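A minimal sketch of this two-stage screening, with an OpenCV Haar cascade standing in for the pre-trained face detector; the random fern classifier is represented by a hypothetical pre-trained object with a predict method, and the differential feature is read here as the pixel-wise difference from a neutral reference face. Both are assumptions, since the patent does not fix a concrete implementation.

```python
# Two-stage screening sketch: Haar-cascade face detection, then expression
# classification of differential features by a random fern classifier.
# fern_classifier is a hypothetical pre-trained model; the differential
# feature here is the pixel-wise difference from a neutral reference face.
import cv2
import numpy as np

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def classify_expressions(frames, neutral_face, fern_classifier):
    labels = []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Preliminary screening: keep only regions the face detector accepts.
        for (x, y, w, h) in face_detector.detectMultiScale(gray, 1.1, 5):
            face = cv2.resize(gray[y:y + h, x:x + w],
                              neutral_face.shape[::-1])
            diff = face.astype(np.float32) - neutral_face.astype(np.float32)
            # Secondary screening: the fern classifier labels the detection,
            # e.g. 0 = unsatisfied, 1 = satisfied (assumed convention).
            labels.append(fern_classifier.predict(diff.ravel()))
    return labels
```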
(2) Identifying the dinner plate of each video segment by adopting a second target detection model, determining the coordinates of each dinner plate, and determining the dinner plate in the specified coordinate range as the dinner plate in each video segment;
by limiting the coordinate range of the dinner plate, the dinner plate of the adjacent table can be prevented from being mistakenly identified as the dinner plate of the table, the misjudgment phenomenon is reduced, and the identification accuracy is improved.
In this embodiment, the color or shape of each dinner plate may differ, so that the plates can be distinguished.
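As a small illustration of the coordinate filtering, with plate detections as (x, y, w, h) bounding boxes and the table's coordinate range as an assumed configuration value:

```python
# Keep only plates whose center lies inside this table's coordinate range,
# so plates on a neighbouring table are not attributed to it.
def plates_on_table(plate_boxes, table_range):
    x_min, y_min, x_max, y_max = table_range
    kept = []
    for (x, y, w, h) in plate_boxes:
        cx, cy = x + w / 2, y + h / 2
        if x_min <= cx <= x_max and y_min <= cy <= y_max:
            kept.append((x, y, w, h))
    return kept
```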
(3) Inputting each video segment into a dish classification model, and outputting each dish corresponding to each dinner plate, wherein the dish classification model is obtained by training a preset dish picture;
the preset dish pictures can be shot in advance.
(4) Inputting each video segment into a behavior detection model, and outputting the behavior characteristics of each customer in each video segment, wherein the behavior characteristics comprise nodding and head shaking, and the behavior detection model is obtained by training with OpenPose and LSTM (Long Short-Term Memory network).
OpenPose detects key points of the human skeleton, and the LSTM extracts temporal features, so behavior characteristics such as nodding and head shaking can be recognized by combining OpenPose and LSTM.
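A sketch of such a behavior detection model in PyTorch: per-frame OpenPose keypoints (25 body points with x, y and confidence in the BODY_25 format) feed an LSTM whose final hidden state is classified into behaviors; the layer sizes and the two-class head are assumptions.

```python
# Behavior detection sketch: an LSTM over sequences of OpenPose keypoints.
# Input shape: (batch, frames, 75), where 75 = 25 keypoints x (x, y, conf).
import torch
import torch.nn as nn

class BehaviorLSTM(nn.Module):
    def __init__(self, n_keypoints=25, hidden=128, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_keypoints * 3, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)  # e.g. 0 = nod, 1 = shake

    def forward(self, keypoint_seq):
        _, (h_n, _) = self.lstm(keypoint_seq)
        return self.head(h_n[-1])  # logits over behavior classes
```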
Through the embodiment, an artificial intelligence means can be combined, at least one feature of each video segment can be extracted by adopting a plurality of pre-trained models, and automatic extraction of the features is realized.
And S13, analyzing at least one characteristic of each video segment to obtain analysis data.
In this embodiment, the at least one characteristic of each video segment may be analyzed by statistical means; the invention is not limited in this respect.
The analysis data comprises the dinner plate amount in each video segment, the number of times of nodding of a customer, the number of times of shaking the head of the customer, the number of times of appearance of satisfied expressions of the customer, and the number of times of appearance of unsatisfied expressions of the customer, and the video analysis method based on artificial intelligence further comprises the following steps:
calculating a first difference value between the number of times of head nodding of the customer and the number of times of head shaking of the customer in each video segment;
calculating a second difference value between the occurrence times of the satisfied expressions of the customer and the occurrence times of the dissatisfied expressions of the customer in each video segment;
acquiring the larger value of the first difference value and the second difference value as the comparison value of each video segment;
when the comparison value of the video segment is greater than or equal to the dinner plate amount, outputting a first score as the overall score of the at least one chef corresponding to the video segment; or
when the comparison value of the video segment is less than or equal to the negative of the dinner plate amount, outputting a second score as the overall score of the at least one chef corresponding to the video segment;
and when the comparison value of the video segment is greater than the negative of the dinner plate amount and less than the dinner plate amount, calculating the quotient of the comparison value and the dinner plate amount as the overall score of the at least one chef corresponding to the video segment.
In this embodiment, the overall score refers to the overall score of the chefs corresponding to all dishes in one dining process of each table.
Through this implementation, an overall score can be generated automatically for the at least one chef corresponding to each video segment; it integrates behavior characteristics such as nodding and head shaking, expression characteristics such as satisfaction and dissatisfaction, and the change of the amount of food in the dinner plates, so the evaluation is more objective.
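As a sketch, the rule above reduces to a ratio clamped at two boundary scores. The patent leaves the first and second scores open; the values 1 and -1 below are assumptions chosen so that the overall score always lies in [-1, 1].

```python
# Overall score of the chef(s) behind one video segment, per the rule above.
# first_score and second_score are assumed values; the patent leaves them open.
def overall_score(nods, shakes, satisfied, unsatisfied, plate_count,
                  first_score=1.0, second_score=-1.0):
    comparison = max(nods - shakes, satisfied - unsatisfied)
    if comparison >= plate_count:
        return first_score
    if comparison <= -plate_count:
        return second_score
    return comparison / plate_count  # strictly between -1 and 1
```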
And S14, connecting the appointed sensor, and calling the target data of each video segment collected by the appointed sensor.
In this embodiment, the sensor may be a weight sensor disposed within each meal tray for collecting the real-time weight of each meal tray.
The sensors may also be deployed on the table to collect the real-time weight of the meal tray at a given location.
The target data comprises an initial weight and an end weight of each dish corresponding to each dinner plate in each video segment, and the video analysis method based on artificial intelligence further comprises the following steps:
calculating the difference between the initial weight and the end weight to obtain a reduction amount;
calculating the ratio of the reduction amount to the initial weight;
and normalizing the ratio to obtain the dish score of each dish.
In this embodiment, the dish score refers to the current score of each dish.
The ratio may be normalized by converting it into a percentage; the invention imposes no limitation here, as long as the units are made uniform.
Through this embodiment, the dish score of each dish is measured by the proportion of the dish that was consumed, which intuitively reflects the popularity of each dish.
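A minimal sketch, taking the normalization to be the percentage conversion mentioned above; this is one reading, not the only one the text allows.

```python
# Dish score sketch: the fraction of the dish consumed, as a percentage.
def dish_score(initial_weight, end_weight):
    reduction = initial_weight - end_weight
    ratio = reduction / initial_weight
    return 100.0 * ratio  # normalization assumed to be percentage conversion
```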
And S15, calculating according to the analysis data and the target data to obtain a video analysis result.
In at least one embodiment of the invention, the following formula is adopted to calculate the sub-score of each chef for each corresponding dish in each video segment according to the overall score of at least one chef corresponding to each video segment and the dish score of each dish:
Y=Y2+Y2/(2^Y1)
wherein Y is the sub-score of each chef in each video segment for each corresponding dish, Y1 is the overall score of at least one chef corresponding to each video segment, and Y2 is the dish score of each dish.
When the overall score is low but a particular dish has been eaten in large quantity, the customers like that dish and the chef's score for it should be high. Conversely, when a dish is barely touched but the chef's overall score is high, that dish drags the overall result down, and the chef's score for it should be low.
Through this implementation, each chef's score for the corresponding dishes in one dining process can be measured from the overall score and the dish score, with both factors considered together to judge customer satisfaction with the dishes made by each chef.
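In code, with a worked example of the behavior just described: an overall score Y1 = -1 and a dish score Y2 = 80 give Y = 80 + 80/2^(-1) = 240, whereas Y1 = 1 gives Y = 120, so a dish that is well eaten at an otherwise dissatisfied table is amplified.

```python
# Sub-score of a chef for one dish in one video segment, using the formula
# from the description: Y = Y2 + Y2 / (2 ** Y1).
def sub_score(overall, dish):
    return dish + dish / (2 ** overall)

print(sub_score(-1, 80))  # 240.0: dissatisfied table, popular dish
print(sub_score(1, 80))   # 120.0: satisfied table, same dish
```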
In at least one embodiment of the present invention, the calculating according to the analysis data and the target data to obtain a video analysis result includes:
determining the occurrence frequency of each dish in each video segment;
calculating the accumulated sum of the sub-scores of each chef for each corresponding dish in each video segment;
for each dish, the quotient of the corresponding cumulative sum and the number of occurrences is calculated as the final score for each chef for the corresponding each dish.
Through this implementation, a final score for each chef for each corresponding dish can be computed across all video segments in an assessment period; it considers overall factors as well as customer satisfaction with each individual dish, allowing a more comprehensive and targeted judgment.
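A sketch of this aggregation; representing the per-segment results as (chef, dish, sub-score) records is an assumption about data organization made for the example.

```python
# Final score sketch: average of a chef's sub-scores for a dish over all
# video segments of the assessment period in which the dish appears.
from collections import defaultdict

def final_scores(records):
    """records: iterable of (chef, dish, sub_score), one per occurrence."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for chef, dish, score in records:
        totals[(chef, dish)] += score
        counts[(chef, dish)] += 1
    return {key: totals[key] / counts[key] for key in totals}
```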
In order to improve safety and privacy, the final scores of each chef for each corresponding dish may be stored in the blockchain.
According to the technical scheme, when a video analysis request is received, the original video collected by the acquisition device is acquired and split into at least one video segment, so that the dining situation of each table can subsequently be judged in a targeted manner; feature extraction is performed on each video segment with a plurality of pre-trained models, and at least one feature of each video segment is output, so that features are extracted automatically by artificial-intelligence means; the at least one feature of each video segment is analyzed to obtain analysis data; a designated sensor is connected, and the target data of each video segment collected by that sensor is retrieved; and a video analysis result is calculated from the analysis data and the target data. The video can thus be analyzed comprehensively and in a targeted manner based on artificial intelligence, with higher analysis efficiency.
Fig. 2 is a functional block diagram of a preferred embodiment of the artificial intelligence based video analysis apparatus according to the present invention. The artificial intelligence based video analysis device 11 includes an acquisition unit 110, a splitting unit 111, an extraction unit 112, an analysis unit 113, a calculation unit 114, and a retrieval unit 115. The module/unit referred to in the present invention refers to a series of computer program segments that can be executed by the processor 13 and that can perform a fixed function, and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
When a video analysis request is received, the acquisition unit 110 acquires the original video collected by the acquisition device.
In this embodiment, the original video may include video shot during an assessment period.
The acquisition device can comprise at least one camera device, the at least one camera device is used for recording videos in a dining area in real time, and each camera device is installed at a fixed position and used for shooting dining videos of customers in a dining room.
In this embodiment, the video analysis may be triggered by the relevant staff, for example: restaurant managers, restaurant assessment personnel, and the like.
The splitting unit 111 splits the original video to obtain at least one video segment.
In at least one embodiment of the present invention, in order to analyze dining conditions of each table, the splitting unit 111 splits the original video to obtain at least one video segment.
Specifically, the splitting unit 111 splits the original video to obtain at least one video segment, including:
splitting the original video according to the dining tables to obtain at least one section of table video including each dining table;
calling at least one piece of dining information included in each table video in the at least one table video from a specified platform;
acquiring the starting time and the ending time of each dining from the at least one piece of dining information;
splitting each section of table video according to the starting time and the ending time to obtain a split video of each section of table video;
and integrating the split video of each section of table video to obtain the at least one video section.
Through the embodiment, at least one video segment obtained after splitting corresponds to the complete video of each dining table in the one-time dining process, so that dishes of each chef can be objectively evaluated in a subsequent targeted manner according to the dining condition of each table.
The extraction unit 112 performs feature extraction on each video segment by using a plurality of models trained in advance, and outputs at least one feature of each video segment.
In this embodiment, the at least one feature may include, but is not limited to: expression categories, dinner plates, dishes, behavior characteristics, and the like.
Specifically, the pre-trained models are stored in a blockchain, the extracting unit 112 performs feature extraction on each video segment by using the pre-trained models, and outputting at least one feature of each video segment includes:
(1) adopting a first target detection model to identify the facial features of each video segment, inputting the identified facial features into an expression classification model, and outputting the expression classes corresponding to the facial features in each video segment, wherein the expression classes comprise satisfied expressions and unsatisfied expressions, and the expression classification model is obtained by training on labeled face pictures from at least one angle;
The first target detection model is trained on labeled face pictures from at least one angle, so faces at various angles can be recognized and the model is more accurate.
The first target detection model is a pre-trained face detector, and the expression classification model is a pre-trained random fern classifier.
Specifically, the extracting unit 112 identifies the facial features of each video segment by using the first target detection model, inputs the identified facial features into the expression classification model, and outputs the expression category corresponding to each facial feature in each video segment, including:
detecting the picture of each video segment by using the face detector to obtain the detection result of each picture;
when the detection result shows that the picture is a face picture, extracting the difference characteristics of the face picture;
and inputting the extracted differential features into the random fern classifier, and outputting the expression category corresponding to each face picture in each video segment.
It should be noted that differential features are used because they are simple to compute, statistically meaningful, and fast to evaluate, which improves the classification speed and efficiency of the random fern classifier.
In addition, this scheme combines the face detector and the random fern classifier: the face detector first performs a preliminary screening for facial features to obtain candidate face pictures, and the random fern classifier then performs a secondary screening on those candidates. That is, the random fern classifier classifies the detection results of the face detector, which ensures the accuracy of the classification results and effectively reduces the false detection rate.
(2) Identifying the dinner plate of each video segment by adopting a second target detection model, determining the coordinates of each dinner plate, and determining the dinner plate in the specified coordinate range as the dinner plate in each video segment;
by limiting the coordinate range of the dinner plate, the dinner plate of the adjacent table can be prevented from being mistakenly identified as the dinner plate of the table, the misjudgment phenomenon is reduced, and the identification accuracy is improved.
In this embodiment, the color or shape of each dinner plate may differ, so that the plates can be distinguished.
(3) Inputting each video segment into a dish classification model, and outputting each dish corresponding to each dinner plate, wherein the dish classification model is obtained by training a preset dish picture;
the preset dish pictures can be shot in advance.
(4) Inputting each video segment into a behavior detection model, and outputting the behavior characteristics of each customer in each video segment, wherein the behavior characteristics comprise nodding and head shaking, and the behavior detection model is obtained by training with OpenPose and LSTM (Long Short-Term Memory network).
OpenPose detects key points of the human skeleton, and the LSTM extracts temporal features, so behavior characteristics such as nodding and head shaking can be recognized by combining OpenPose and LSTM.
Through the embodiment, an artificial intelligence means can be combined, at least one feature of each video segment can be extracted by adopting a plurality of pre-trained models, and automatic extraction of the features is realized.
The analysis unit 113 analyzes at least one feature of each video segment to obtain analysis data.
In this embodiment, the at least one characteristic of each video segment may be analyzed by statistical means; the invention is not limited in this respect.
The analysis data comprises the dinner plate amount in each video segment, the number of times the customer nods, the number of times the customer shakes the head, the number of times satisfied expressions of the customer appear, and the number of times unsatisfied expressions of the customer appear, and the following is further performed:
calculating a first difference value between the number of times of head nodding of the customer and the number of times of head shaking of the customer in each video segment;
calculating a second difference value between the occurrence times of the satisfied expressions of the customer and the occurrence times of the dissatisfied expressions of the customer in each video segment;
acquiring the larger value of the first difference value and the second difference value as the comparison value of each video segment;
when the comparison value of the video segment is greater than or equal to the dinner plate amount, outputting a first score as the overall score of the at least one chef corresponding to the video segment; or
when the comparison value of the video segment is less than or equal to the negative of the dinner plate amount, outputting a second score as the overall score of the at least one chef corresponding to the video segment;
and when the comparison value of the video segment is greater than the negative of the dinner plate amount and less than the dinner plate amount, calculating the quotient of the comparison value and the dinner plate amount as the overall score of the at least one chef corresponding to the video segment.
In this embodiment, the overall score refers to the overall score of the chefs corresponding to all dishes in one dining process of each table.
Through this implementation, an overall score can be generated automatically for the at least one chef corresponding to each video segment; it integrates behavior characteristics such as nodding and head shaking, expression characteristics such as satisfaction and dissatisfaction, and the change of the amount of food in the dinner plates, so the evaluation is more objective.
The retrieval unit 115 is connected to a designated sensor, and retrieves target data of each video segment acquired by the designated sensor.
In this embodiment, the sensor may be a weight sensor disposed within each meal tray for collecting the real-time weight of each meal tray.
The sensors may also be deployed on the table to collect the real-time weight of the meal tray at a given location.
The target data includes an initial weight and an end weight of each dish corresponding to each dinner plate in each video segment, and the following is further performed:
calculating the difference between the initial weight and the end weight to obtain a reduction amount;
calculating the ratio of the reduction amount to the initial weight;
and normalizing the ratio to obtain the dish score of each dish.
In this embodiment, the dish score refers to the current score of each dish.
The ratio may be normalized by converting it into a percentage; the invention imposes no limitation here, as long as the units are made uniform.
Through this embodiment, the dish score of each dish is measured by the proportion of the dish that was consumed, which intuitively reflects the popularity of each dish.
The calculating unit 114 performs calculation according to the analysis data and the target data to obtain a video analysis result.
In at least one embodiment of the present invention, the calculating unit 114 calculates the sub-score of each chef for each dish in each video segment according to the overall score of at least one chef corresponding to each video segment and the dish score of each dish by using the following formula:
Y=Y2+Y2/(2^Y1)
wherein Y is the sub-score of each chef in each video segment for each corresponding dish, Y1 is the overall score of at least one chef corresponding to each video segment, and Y2 is the dish score of each dish.
When the overall score is low but a particular dish has been eaten in large quantity, the customers like that dish and the chef's score for it should be high. Conversely, when a dish is barely touched but the chef's overall score is high, that dish drags the overall result down, and the chef's score for it should be low.
Through this implementation, each chef's score for the corresponding dishes in one dining process can be measured from the overall score and the dish score, with both factors considered together to judge customer satisfaction with the dishes made by each chef.
In at least one embodiment of the present invention, the calculating unit 114 performs calculation according to the analysis data and the target data, and obtaining a video analysis result includes:
determining the occurrence frequency of each dish in each video segment;
calculating the accumulated sum of the sub-scores of each chef for each corresponding dish in each video segment;
for each dish, the quotient of the corresponding cumulative sum and the number of occurrences is calculated as the final score for each chef for the corresponding each dish.
Through this implementation, a final score for each chef for each corresponding dish can be computed across all video segments in an assessment period; it considers overall factors as well as customer satisfaction with each individual dish, allowing a more comprehensive and targeted judgment.
In order to improve safety and privacy, the final scores of each chef for each corresponding dish may be stored in the blockchain.
According to the technical scheme, when a video analysis request is received, the original video collected by the acquisition device is acquired and split into at least one video segment, so that the dining situation of each table can subsequently be judged in a targeted manner; feature extraction is performed on each video segment with a plurality of pre-trained models, and at least one feature of each video segment is output, so that features are extracted automatically by artificial-intelligence means; the at least one feature of each video segment is analyzed to obtain analysis data; a designated sensor is connected, and the target data of each video segment collected by that sensor is retrieved; and a video analysis result is calculated from the analysis data and the target data. The video can thus be analyzed comprehensively and in a targeted manner based on artificial intelligence, with higher analysis efficiency.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing an artificial intelligence-based video analysis method.
The electronic device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as an artificial intelligence based video analysis program, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation on it; the electronic device 1 may have a bus-type or star-type structure, may include more or fewer hardware or software components than shown, or a different arrangement of components; for example, it may further include input and output devices, network access devices, and the like.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products that can be adapted to the present invention should also fall within the scope of protection of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of an artificial intelligence based video analysis program, etc., but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., executing an artificial intelligence-based video analysis program, etc.) stored in the memory 12 and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in the various artificial intelligence based video analytics embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the electronic device 1. For example, the computer program may be divided into an acquisition unit 110, a splitting unit 111, an extraction unit 112, an analysis unit 113, a calculation unit 114, a retrieval unit 115.
Alternatively, the processor 13, when executing the computer program, implements the functions of the modules/units in the above device embodiments, for example:
when a video analysis request is received, acquiring an original video collected by an acquisition device;
splitting the original video to obtain at least one video segment;
extracting the characteristics of each video segment by using a plurality of pre-trained models, and outputting at least one characteristic of each video segment;
analyzing at least one characteristic of each video segment to obtain analysis data;
connecting a designated sensor, and calling target data of each video segment acquired by the designated sensor;
and calculating according to the analysis data and the target data to obtain a video analysis result.
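Read as a pipeline, the six steps above compose naturally. The following Python sketch is purely illustrative: every name in it (`Segment`, `split_video`, `read_sensor`, the 25 fps value, the dict-based feature format) is a hypothetical stand-in, not the disclosed implementation. Each concrete model from the claims below would slot in as one entry of `models`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names here are hypothetical stand-ins for the six steps above,
# not the disclosed implementation.

@dataclass
class Segment:
    start: float        # segment start time, in seconds
    end: float          # segment end time, in seconds
    frames: list        # decoded frames belonging to this segment

def split_video(frames: list, boundaries: List[Tuple[float, float]],
                fps: float = 25.0) -> List[Segment]:
    """Step 2: cut the original video at known (start, end) boundaries."""
    return [Segment(s, e, frames[int(s * fps):int(e * fps)])
            for s, e in boundaries]

def analyze_segment(seg: Segment, models: List[Callable]) -> Dict:
    """Steps 3-4: run every pre-trained model on the segment and merge
    the outputs (each model is assumed to return a dict such as {'nods': 3})."""
    analysis: Dict = {}
    for model in models:
        analysis.update(model(seg))
    return analysis

def video_analysis(frames, boundaries, models, read_sensor) -> List[Dict]:
    """Steps 1-6 end to end; `read_sensor(start, end)` stands in for the
    designated sensor of step 5."""
    results = []
    for seg in split_video(frames, boundaries):
        analysis = analyze_segment(seg, models)
        target = read_sensor(seg.start, seg.end)   # step 5
        results.append({**analysis, **target})     # step 6: combined result
    return results
```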
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer or a network device) or a processor (processor) to execute parts of the artificial intelligence based video analysis method according to the embodiments of the present invention.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U-disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
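As a toy illustration of the "series of data blocks associated by cryptographic methods" just described, the sketch below chains blocks by hashing each block's contents together with its predecessor's hash. It is a teaching aid only, assuming SHA-256 and a JSON body; it is not the blockchain platform referred to in this disclosure.

```python
import hashlib
import json

def make_block(transactions: list, prev_hash: str) -> dict:
    """Each block stores a batch of transactions plus the previous block's
    hash; altering any block invalidates every hash that follows it."""
    body = {"transactions": transactions, "prev_hash": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

genesis = make_block(["pre-trained model v1 stored"], prev_hash="0" * 64)
block_2 = make_block(["pre-trained model v2 stored"], genesis["hash"])
```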
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one arrow is shown in fig. 3, but this does not mean that there is only one bus or one type of bus. The bus is arranged to enable communication between the memory 12, the at least one processor 13, and other components.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may include a display (Display) and an input unit (such as a keyboard), and optionally a standard wired interface and/or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used to display the information processed in the electronic device 1 and to present a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
With reference to fig. 1, the memory 12 of the electronic device 1 stores a plurality of instructions to implement an artificial intelligence based video analysis method, and the processor 13 executes the plurality of instructions to implement:
when a video analysis request is received, acquiring an original video captured by an acquisition device;
splitting the original video to obtain at least one video segment;
performing feature extraction on each video segment by using a plurality of pre-trained models, and outputting at least one feature of each video segment;
analyzing the at least one feature of each video segment to obtain analysis data;
connecting to a designated sensor, and retrieving the target data of each video segment collected by the designated sensor;
and calculating according to the analysis data and the target data to obtain a video analysis result.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, and so on are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (10)
1. A video analysis method based on artificial intelligence, characterized in that the method comprises the following steps:
when a video analysis request is received, acquiring an original video captured by an acquisition device;
splitting the original video to obtain at least one video segment;
performing feature extraction on each video segment by using a plurality of pre-trained models, and outputting at least one feature of each video segment;
analyzing the at least one feature of each video segment to obtain analysis data;
connecting to a designated sensor, and retrieving the target data of each video segment collected by the designated sensor;
and calculating according to the analysis data and the target data to obtain a video analysis result.
2. The artificial intelligence based video analysis method of claim 1, wherein the splitting the original video to obtain at least one video segment comprises:
splitting the original video according to dining tables to obtain at least one section of table video, each section including one dining table;
calling, from a specified platform, at least one piece of dining information included in each section of table video;
acquiring the start time and the end time of each meal from the at least one piece of dining information;
splitting each section of table video according to the start time and the end time to obtain split videos of each section of table video;
and integrating the split videos of each section of table video to obtain the at least one video segment.
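Claim 2 describes a two-level split: first per dining table, then per meal using the start and end times pulled from the specified platform. Below is a minimal sketch of the time-based second split, assuming decoded frame lists and times in seconds; the platform interface itself is not shown and the function name is hypothetical.

```python
from typing import List, Tuple

def split_by_meals(table_video: list, meals: List[Tuple[float, float]],
                   fps: float = 25.0) -> List[list]:
    """Cut one table's video at each meal's (start, end) time in seconds.
    `table_video` is the decoded frame list for a single dining table;
    `meals` would come from the dining information on the platform."""
    return [table_video[int(start * fps):int(end * fps)]
            for start, end in meals]

# e.g. two meals recorded by the platform for this table:
# segments = split_by_meals(frames, [(0.0, 1800.0), (3600.0, 5400.0)])
```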
3. The artificial intelligence based video analysis method of claim 1, wherein the pre-trained models are stored on a blockchain, and the performing feature extraction on each video segment by using the plurality of pre-trained models and outputting the at least one feature of each video segment comprises:
recognizing the facial features in each video segment by using a first target detection model, inputting the recognized facial features into an expression classification model, and outputting the expression category corresponding to each facial feature in each video segment, wherein the expression categories comprise satisfied expressions and unsatisfied expressions, and the expression classification model is obtained by training on face pictures labeled from at least one angle;
recognizing the dinner plates in each video segment by using a second target detection model, determining the coordinates of each dinner plate, and determining the dinner plates within a specified coordinate range as the dinner plates of each video segment;
inputting each video segment into a dish classification model, and outputting each dish corresponding to each dinner plate, wherein the dish classification model is obtained by training on preset dish pictures;
and inputting each video segment into a behavior detection model, and outputting the behavior features of each customer in each video segment, wherein the behavior features comprise nodding and head shaking, and the behavior detection model is obtained by training with OpenPose and an LSTM.
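Of the four extractors in claim 3, the dinner plate step combines detection with a purely geometric filter: a detected plate counts for a table only if its coordinates fall within that table's specified range. The sketch below shows that filter; the (x, y, w, h) box format and the function name are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x, y, width, height)

def plates_in_range(detections: List[Box],
                    region: Tuple[float, float, float, float]) -> List[Box]:
    """Keep only the plates whose centre lies inside the table's specified
    coordinate range; `detections` are boxes from the second target
    detection model, `region` is (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = region
    kept = []
    for (x, y, w, h) in detections:
        cx, cy = x + w / 2, y + h / 2
        if x_min <= cx <= x_max and y_min <= cy <= y_max:
            kept.append((x, y, w, h))
    return kept
```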
4. The artificial intelligence based video analysis method of claim 3, wherein the first target detection model is a pre-trained face detector, the expression classification model is a pre-trained random fern classifier, and the recognizing the facial features in each video segment by using the first target detection model, inputting the recognized facial features into the expression classification model, and outputting the expression category corresponding to each facial feature in each video segment comprises:
detecting the pictures of each video segment by using the face detector to obtain a detection result for each picture;
when the detection result shows that a picture is a face picture, extracting the differential features of the face picture;
and inputting the extracted differential features into the random fern classifier, and outputting the expression category corresponding to each face picture in each video segment.
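A random fern classifier, as named in claim 4, groups binary pixel-difference tests into small "ferns" and combines the per-fern class posteriors in semi-naive-Bayes fashion. The sketch below shows only the inference step; the fern layout, the trained log-posterior table, and the two expression classes are assumptions for illustration, not the trained classifier of the disclosure.

```python
import numpy as np

def fern_codes(img: np.ndarray, ferns: np.ndarray) -> np.ndarray:
    """`ferns` has shape (n_ferns, depth, 2): pairs of pixel indices.
    Each pair yields one bit (is pixel a brighter than pixel b?), and a
    fern's bits are packed into one integer code."""
    flat = img.ravel()
    bits = flat[ferns[:, :, 0]] > flat[ferns[:, :, 1]]   # (n_ferns, depth)
    return bits @ (1 << np.arange(bits.shape[1]))        # one code per fern

def classify(img: np.ndarray, ferns: np.ndarray, log_post: np.ndarray,
             classes=("satisfied", "unsatisfied")) -> str:
    """`log_post` has shape (n_ferns, 2**depth, n_classes): the trained
    log class-posteriors per fern code. Summing them over ferns is the
    semi-naive-Bayes combination used by random ferns."""
    codes = fern_codes(img, ferns)
    scores = log_post[np.arange(len(ferns)), codes].sum(axis=0)
    return classes[int(np.argmax(scores))]
```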
5. The artificial intelligence based video analysis method of claim 1, wherein the analysis data comprises the dinner plate amount in each video segment, the number of times the customer nods, the number of times the customer shakes his or her head, the number of occurrences of the customer's satisfied expressions, and the number of occurrences of the customer's unsatisfied expressions, and the method further comprises:
calculating a first difference between the number of nods and the number of head shakes of the customer in each video segment;
calculating a second difference between the number of occurrences of the customer's satisfied expressions and the number of occurrences of the customer's unsatisfied expressions in each video segment;
taking the larger of the first difference and the second difference as the comparison value of each video segment;
when the comparison value of the video segment is greater than or equal to the dinner plate amount, outputting a first score as the overall score of the at least one chef corresponding to the video segment; or
when the comparison value of the video segment is less than or equal to the negative of the dinner plate amount, outputting a second score as the overall score of the at least one chef corresponding to the video segment;
and when the comparison value of the video segment is greater than the negative of the dinner plate amount and less than the dinner plate amount, calculating the quotient of the comparison value and the dinner plate amount as the overall score of the at least one chef corresponding to the video segment.
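The three branches of claim 5 amount to a ratio clamped at the first and second scores. The claim does not fix those two score values; the sketch below assumes 1 and -1, under which the whole rule reduces to a clamp.

```python
def chef_overall_score(nods: int, shakes: int, satisfied: int,
                       unsatisfied: int, plates: int,
                       first_score: float = 1.0,
                       second_score: float = -1.0) -> float:
    """Claim 5's three branches; the values 1 and -1 for the first and
    second scores are assumptions, under which the result is simply the
    comparison value over the plate amount, clamped to [-1, 1]."""
    comparison = max(nods - shakes, satisfied - unsatisfied)
    if comparison >= plates:
        return first_score
    if comparison <= -plates:
        return second_score
    return comparison / plates

# 5 nods, 1 shake, 3 satisfied, 2 unsatisfied expressions, 8 plates:
# chef_overall_score(5, 1, 3, 2, 8) == 0.5
```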
6. The artificial intelligence based video analysis method of claim 1, wherein the target data comprises the initial weight and the end weight of each dish corresponding to each dinner plate in each video segment, and the method further comprises:
calculating the difference between the initial weight and the end weight to obtain a reduction amount;
calculating the ratio of the reduction amount to the initial weight;
and standardizing the ratio to obtain the dish score of each dish.
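Claim 6 scores a dish by how much of it was eaten: the weight reduction as a fraction of the initial weight, then standardized. The standardization method is left open; the sketch below assumes min-max scaling over the dishes of one segment.

```python
from typing import List, Tuple

def dish_scores(weights: List[Tuple[float, float]]) -> List[float]:
    """`weights` holds (initial, end) readings from the designated sensor,
    one pair per dish. The eaten fraction is (initial - end) / initial;
    min-max scaling to [0, 1] stands in for the unspecified
    standardization step."""
    ratios = [(w0 - w1) / w0 for w0, w1 in weights]
    lo, hi = min(ratios), max(ratios)
    if hi == lo:                      # every dish eaten in equal proportion
        return [1.0 for _ in ratios]
    return [(r - lo) / (hi - lo) for r in ratios]

# dish_scores([(500, 100), (400, 300), (300, 60)]) -> [1.0, 0.0, 1.0]
```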
7. The artificial intelligence based video analysis method of claim 5 or 6, further comprising:
calculating the sub-score of each chef for each corresponding dish in each video segment from the overall score of the at least one chef corresponding to each video segment and the dish score of each dish, using the following formula:
Y = Y2 + Y2 / (2^Y1)
wherein Y is the sub-score of each chef for each corresponding dish in each video segment, Y1 is the overall score of the at least one chef corresponding to each video segment, and Y2 is the dish score of each dish.
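Numerically, the formula adds a bonus equal to the dish score halved once per point of the overall score: at Y1 = 0 the sub-score is 2·Y2, and each additional point of Y1 halves the added term. A worked check of the arithmetic:

```python
def sub_score(y1: float, y2: float) -> float:
    """Y = Y2 + Y2 / 2**Y1 from claim 7; y1 is the chef's overall score
    for the segment, y2 is the dish score of the dish."""
    return y2 + y2 / (2 ** y1)

# y1 = 0.5 and y2 = 0.8 give 0.8 + 0.8 / 2**0.5, roughly 1.366
```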
8. The artificial intelligence based video analysis method of claim 7, wherein the calculating according to the analysis data and the target data to obtain a video analysis result comprises:
determining the number of occurrences of each dish in each video segment;
calculating the cumulative sum of the sub-scores of each chef for each corresponding dish in each video segment;
and, for each dish, calculating the quotient of the corresponding cumulative sum and the number of occurrences as the final score of each chef for the corresponding dish.
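The final score of claim 8 is therefore the mean sub-score per chef-dish pair: a cumulative sum over the dish's appearances divided by the number of appearances. A sketch, with the (chef, dish, sub-score) record format assumed:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def final_scores(records: List[Tuple[str, str, float]]
                 ) -> Dict[Tuple[str, str], float]:
    """`records` holds (chef, dish, sub_score) triples collected over the
    video segments; the final score is the mean sub-score per pair."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for chef, dish, score in records:
        totals[(chef, dish)] += score   # cumulative sum of sub-scores
        counts[(chef, dish)] += 1       # number of occurrences of the dish
    return {pair: totals[pair] / counts[pair] for pair in totals}

# final_scores([("chef_a", "fried rice", 1.2), ("chef_a", "fried rice", 0.8)])
# -> {("chef_a", "fried rice"): 1.0}
```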
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor executing the instructions stored in the memory to implement the artificial intelligence based video analysis method of any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is executed by a processor in an electronic device to implement the artificial intelligence based video analysis method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010622578.6A CN111797756A (en) | 2020-06-30 | 2020-06-30 | Video analysis method, device and medium based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111797756A true CN111797756A (en) | 2020-10-20 |
Family
ID=72809915
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010622578.6A Pending CN111797756A (en) | 2020-06-30 | 2020-06-30 | Video analysis method, device and medium based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111797756A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104677481A (en) * | 2015-03-13 | 2015-06-03 | 广州视源电子科技股份有限公司 | Food weight monitoring method and food weight monitoring device |
CN105243270A (en) * | 2015-09-23 | 2016-01-13 | 小米科技有限责任公司 | Diet monitoring method, apparatus and system and catering furniture |
US20180336603A1 (en) * | 2017-05-22 | 2018-11-22 | Fujitsu Limited | Restaurant review systems |
CN107886568A (en) * | 2017-12-09 | 2018-04-06 | 东方梦幻文化产业投资有限公司 | A kind of method and system that human face expression is rebuild using 3D Avatar |
CN108197544A (en) * | 2017-12-22 | 2018-06-22 | 深圳云天励飞技术有限公司 | Human face analysis, filter method, device, embedded device, medium and integrated circuit |
CN109508664A (en) * | 2018-10-26 | 2019-03-22 | 浙江师范大学 | A kind of vegetable identification pricing method based on deep learning |
CN109977854A (en) * | 2019-03-25 | 2019-07-05 | 浙江新再灵科技股份有限公司 | Unusual checking analysis system under a kind of elevator monitoring environment |
CN110931109A (en) * | 2019-12-06 | 2020-03-27 | 杭州雄伟科技开发股份有限公司 | Diet condition analysis method and system |
Non-Patent Citations (1)
Title |
---|
Yu Zhengping: "Military Operational Medicine" (军事作业医学), vol. 1, 31 March 2009, Military Medical Science Press, page 104 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112437279A (en) * | 2020-11-23 | 2021-03-02 | 方战领 | Video analysis method and device |
CN112235596A (en) * | 2020-12-10 | 2021-01-15 | 杭州次元岛科技有限公司 | Live webcast-based food recommendation method and device |
CN112235596B (en) * | 2020-12-10 | 2021-03-19 | 杭州次元岛科技有限公司 | Live webcast-based food recommendation method and device |
CN113032460A (en) * | 2021-03-24 | 2021-06-25 | 中国长江电力股份有限公司 | Dining room ordering and preparing system with overall process data analysis and data analysis method |
CN113657203A (en) * | 2021-07-28 | 2021-11-16 | 陈蕾 | Intelligent device operating system based on block chain |
CN114743153A (en) * | 2022-06-10 | 2022-07-12 | 北京航空航天大学杭州创新研究院 | Non-sensory dish-taking model establishing and dish-taking method and device based on video understanding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111797756A (en) | Video analysis method, device and medium based on artificial intelligence | |
Vankipuram et al. | Toward automated workflow analysis and visualization in clinical environments | |
CN111770317B (en) | Video monitoring method, device, equipment and medium for intelligent community | |
CN110414370B (en) | Face shape recognition method and device, electronic equipment and storage medium | |
CN109376982B (en) | Target employee selection method and device | |
CN112100425B (en) | Label labeling method and device based on artificial intelligence, electronic equipment and medium | |
US20220125360A1 (en) | Method and computer program for determining psychological state through drawing process of counseling recipient | |
WO2021068781A1 (en) | Fatigue state identification method, apparatus and device | |
CN110299193A (en) | Chinese medicine health cloud service method based on artificial intelligence lingual diagnosis | |
CN114066534A (en) | Elevator advertisement delivery method, device, equipment and medium based on artificial intelligence | |
CN114821483B (en) | Monitoring method and system capable of measuring temperature and applied to monitoring video | |
CN111986744A (en) | Medical institution patient interface generation method and device, electronic device and medium | |
CN114334169A (en) | Medical object category decision method and device, electronic equipment and storage medium | |
CN114781805A (en) | Nursing staff nursing skill evaluation method, system and device based on big data | |
CN110796014A (en) | Garbage throwing habit analysis method, system and device and storage medium | |
CN114639152A (en) | Multi-modal voice interaction method, device, equipment and medium based on face recognition | |
CN110825808A (en) | Distributed human face database system based on edge calculation and generation method thereof | |
CN114334175A (en) | Hospital epidemic situation monitoring method and device, computer equipment and storage medium | |
TW201502999A (en) | A behavior identification and follow up system | |
US11151598B2 (en) | Scoring image engagement in digital media | |
CN109801394B (en) | Staff attendance checking method and device, electronic equipment and readable storage medium | |
US20200065631A1 (en) | Produce Assessment System | |
CN116030940A (en) | Psychological evaluation management method and system based on big data | |
JP2010198199A (en) | Information providing system and method | |
CN112328752B (en) | Course recommendation method and device based on search content, computer equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||