Creating a Grocery Product and Recipe Recommender

Mrinalini Sundar 16 Apr, 2021 • 8 min read
This article was published as a part of the Data Science Blogathon.


We all purchase groceries, whether once a month or weekly. Grocery shopping at stores such as Walmart or Kroger typically includes vegetables, meat, dairy products, and other household requisites. Some people even have a recurring list of items that they purchase every time they visit such stores. But how can we analyze this data and provide better services to customers? And how can we sell products related to popular purchases or to a targeted customer’s purchase record?

In this blog, we will explore how data from multiple sources can be brought in and analyzed in order to get useful information. This information can further be used to improve customer engagement, list and sell products to targeted customers, provide better quality products, and reduce cost. We will also see how apps and websites (ads) can offer product recommendations that make customers more likely to purchase the products needed for the suggested recipes.



First, we analyze the customer’s needs, and then we determine what his most common purchases are. To achieve this, we use a customer purchase dataset. This dataset was mainly collected from one of Kroger’s retail stores in the USA. Interestingly, the data was collected during a super-saver weekend to understand more about customer buying behavior. It can be found here.

Other than this, there are many open-source datasets that can be used for this analysis. We will also be suggesting recipes to customers based on their recent purchases or on the most common purchases of that area (or customer). These datasets include:


Beyond this customer purchase dataset and these recipe datasets, several others can be found on the internet. It is better to merge the purchase datasets into one master file and the recipe datasets into another. Having these two master files makes the analysis process easier.
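
As a rough illustration of building one master file from several purchase sources, here is a minimal sketch using only the Python standard library. The column names, the alias table, and the sample CSV text are all hypothetical; real datasets will use their own headers, and the recipe files could be merged the same way.

```python
import csv
import io

# Hypothetical mapping from source-specific headers to common column names.
COLUMN_ALIASES = {
    "item": "product",
    "product_name": "product",
    "member": "customer_id",
    "customer": "customer_id",
}

def read_normalized(csv_text, source):
    """Parse one CSV source and rename its columns to the common schema."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {COLUMN_ALIASES.get(key, key): value.strip() for key, value in raw.items()}
        row["product"] = row["product"].lower()  # unify product spelling
        row["source"] = source                   # remember where the row came from
        rows.append(row)
    return rows

def merge_sources(sources):
    """Concatenate all normalized sources into one master list of rows."""
    master = []
    for name, text in sources.items():
        master.extend(read_normalized(text, name))
    return master

# Hypothetical sample data standing in for two downloaded files.
store_a = "customer_id,product\n1,Whole Milk\n2,Yogurt\n"
store_b = "member,item\n7, Beef \n"
master = merge_sources({"store_a": store_a, "store_b": store_b})
```

Each row in `master` then has the same `customer_id`, `product`, and `source` keys regardless of which file it came from.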



First, we need to analyze the data from the customer grocery purchase dataset. We then find the products or categories that are most purchased by the customers. After processing the raw file, we get the required results. The next step is to compare the results with the most bought items per customer. This is an important phase as all further processing will depend on the results of this analysis.

  1. Overall Purchased Groceries

    According to the results, the most purchased products from the grocery stores were dairy products such as milk and yogurt, along with buns. Next came vegetables and fruits, and then everyday requisites like bottled water, soda, meat, etc. The graphical representation of this analysis is:

  1. Personalized Purchased Groceries

    Now, we have a clear understanding of the products that are sold most and the ones that customers need the most. But we still don’t have the list of products an individual customer purchases. For that, we need to compare the record of a single customer’s purchases with that dataset and see which combinations of products were bought most often. The result of this analysis is the set of products most commonly purchased by customers.

    One way to approach this is to take the results of the Overall Purchased analysis, compare them directly with the original dataset, and find which of the products are on an average customer’s product list. By following this process, we can generate a list of products that are on almost every customer’s list at the end of the month or week. For instance, Whole Milk, Beef, Vegetables, Fruits, etc. are things that are on the list of almost every customer going to the grocery store for monthly shopping.
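
The two analyses above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `transactions` rows are made-up sample data, and the 60% threshold used to call a product a "staple" is an arbitrary choice, not a value from the original analysis.

```python
from collections import Counter

# Hypothetical sample data: each row is (customer_id, basket of products).
transactions = [
    ("c1", ["whole milk", "yogurt", "buns"]),
    ("c2", ["whole milk", "beef", "bottled water"]),
    ("c1", ["whole milk", "vegetables"]),
    ("c3", ["whole milk", "yogurt", "soda"]),
]

def overall_counts(rows):
    """Count in how many baskets each product appears."""
    counts = Counter()
    for _, basket in rows:
        counts.update(set(basket))  # count a product once per basket
    return counts

def customer_share(rows):
    """Fraction of customers who bought each product at least once."""
    buyers = {}
    customers = set()
    for customer, basket in rows:
        customers.add(customer)
        for product in basket:
            buyers.setdefault(product, set()).add(customer)
    return {product: len(who) / len(customers) for product, who in buyers.items()}

def staple_products(rows, min_share=0.6):
    """Products found on the lists of at least min_share of customers."""
    share = customer_share(rows)
    return sorted(product for product, s in share.items() if s >= min_share)

top_product = overall_counts(transactions).most_common(1)[0]
staples = staple_products(transactions)
```

With the sample rows, `top_product` is `("whole milk", 4)` and `staples` is `["whole milk", "yogurt"]`.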



Predicting Products & Ingredients

Now that we have analyzed our data and have these datasets in the required format, we can move to our next stage: predicting ingredients for recipes, or products, based on the customer’s most related product list. We can predict products and recipes based on the customer’s current list and on the common list from the analysis that best matches it.


  1. Predicting Products

    To do so, we can use a simple approach: compare the product list that the customer has (or may have) with the master dataset of common product sets produced by the common products analysis. We can see which common sets are most related to the customer’s list and which of their products are not on it. Then we compare those missing products with the most bought products and display them in order of number of purchases (popularity).


  1. Predicting Ingredients of Recipes

    If we are predicting recipes, we can use a different approach. Instead of the most commonly purchased products, we compare the products on the customer’s list with the ingredients of recipes. For instance, if the user has a list of products including Bread and Ham, comparing it to the dataset may return the recipe for a sandwich. The system then suggests products like Mayonnaise and Cheese so the customer can make a sandwich.

    Note: The dataset includes recipes too (How to make a sandwich), so that can also be displayed to the user in the app.
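
Both approaches can be sketched with plain set operations. The code below is a minimal illustration, not the article's actual implementation: the function names, `common_sets`, `popularity`, and `recipes` are hypothetical stand-ins for the master files, and "related" is assumed to mean "shares at least one product with the customer's list".

```python
from collections import Counter

def recommend_products(customer_list, common_sets, popularity, top_n=3):
    """Suggest products from related common sets that the customer lacks."""
    have = set(customer_list)
    candidates = set()
    for product_set in common_sets:
        if have & product_set:              # the set shares a product with the list
            candidates |= product_set - have
    # Most purchased first; ties broken alphabetically for stable output.
    ranked = sorted(candidates, key=lambda p: (-popularity.get(p, 0), p))
    return ranked[:top_n]

def recommend_recipes(customer_list, recipes):
    """Rank recipes by ingredient overlap and list what is still missing."""
    have = set(customer_list)
    matches = []
    for name, ingredients in recipes.items():
        overlap = have & ingredients
        if overlap:
            matches.append((name, sorted(ingredients - have), len(overlap)))
    matches.sort(key=lambda m: (-m[2], m[0]))
    return [(name, missing) for name, missing, _ in matches]

# Hypothetical master data.
common_sets = [{"whole milk", "yogurt", "buns"}, {"beef", "vegetables", "soda"}]
popularity = Counter({"whole milk": 40, "vegetables": 30, "yogurt": 25,
                      "beef": 18, "soda": 12, "buns": 10})
recipes = {
    "sandwich": {"bread", "ham", "mayonnaise", "cheese"},
    "banana shake": {"bananas", "milk"},
}

product_tips = recommend_products(["buns", "beef"], common_sets, popularity)
recipe_tips = recommend_recipes(["bread", "ham"], recipes)
```

For the sample data, `product_tips` is `["whole milk", "vegetables", "yogurt"]`, and `recipe_tips` is `[("sandwich", ["cheese", "mayonnaise"])]`, matching the Bread and Ham example above.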


Case Study

Now let’s consider a case study. A customer has your app installed and uses it for online grocery orders, or to make a list of groceries he needs. When he starts making a list, he is shown some recommendations of recipes and their required ingredients. These recipes are those whose ingredients match the products he regularly or commonly purchases.


  1. Predicting Products

    As an example, let’s assume your app lists grocery items. When the customer adds products to his list, the number of recommended recipes decreases as the ingredients become more and more specific. Just like the picture shows, you can sort the products the customer wants to buy into categories and then predict recipes based on any particular category.

    For instance, if the user has entered Bananas, you can suggest Milk, based on the recipe for a banana shake. The same applies to other items: if the user has selected eggs, cocoa powder, and flour, you can suggest the recipe for a chocolate cake and list all the remaining ingredients that he will need to make that cake.

    Even if you don’t want to make a shopping or grocery app, you can make a grocery item listing app and can add a prediction feature as a key marketing feature. As you further add more features and process different datasets of different regions, you can also add region-wise predictions.

  1. Predicting Recipes




    For instance, let us assume your app also displays recommended recipes based on the products added to the list so far. You may also allow users to search recipes by certain tags or ingredients. All of that can be handled with a few tweaks to the existing program.

    When the user clicks show recommended recipes or if you decide to display them in a panel or whatever the UI design suggests, he can see and scroll through many recipes that are connected with the ingredients that he has written in the grocery list.

    This data can also target an audience that purchases groceries online. Every time the customer purchases products online or adds them to the list, they can be stored in the backend server database, which can be used to predict future products and ingredients based on recipes of his most common purchases.

    His commonly used recipes can also be used in the same way, but instead of processing to determine ingredients, in this case, we can suggest ingredients based on his saved common recipes in our database.



  1. Predicting Ingredients of the Recipes

    Taking the same app, when the customer clicks on any of the predicted recipes, its ingredients are displayed, and the user can browse through them and check their availability from the online grocer. He has an add recipe button at the bottom through which he can add all ingredients required to his personal shopping list. For those ingredients that are already present there, their quantity is increased by the required number in the recipe.

    Suppose the user has Flour, Cocoa Powder, and Baking soda in his grocery list. He is then shown recommended recipes based on the data, such as Chocolate Cake, Pie, Cupcakes, and so on. The user clicks on Chocolate cupcakes. He is then shown multiple options: which nearby store sells them, how to make them, what their size is (in inches), and which ingredients he will need, along with the quantities.

    Now based on the design of the app, the user will be able to add ingredients individually to his list or have an option titled import ingredients that imports all ingredients to the list made by the customer. As the size of the list increases the product set matches more and more recipe sets and more recipes can be predicted and suggested.
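
The "import ingredients" behaviour described above, where quantities of items already on the list are increased, can be sketched with a `Counter`. The function name and the quantities are hypothetical and only serve as an illustration.

```python
from collections import Counter

def add_recipe_to_list(shopping_list, recipe_ingredients):
    """Return a new list with the recipe's quantities added in."""
    updated = Counter(shopping_list)
    # Counter.update adds quantities for existing items and creates new ones.
    updated.update(recipe_ingredients)
    return dict(updated)

# Hypothetical list and recipe quantities.
shopping_list = {"flour": 1, "cocoa powder": 1, "baking soda": 1}
chocolate_cupcakes = {"flour": 2, "cocoa powder": 1, "eggs": 2, "sugar": 1}

new_list = add_recipe_to_list(shopping_list, chocolate_cupcakes)
```

Here `new_list` holds 3 units of flour and 2 of cocoa powder, keeps baking soda at 1, and gains eggs and sugar, while the original `shopping_list` is left unchanged.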


In a Grocery Store

Consider the same customer now standing in front of a fridge in the grocery store. He can see many products from its glass window, and he is confused about which product to pick, or he forgets a product or is simply trying a new recipe. So, he opens the app, takes a picture of the items in the fridge, or of a particular item, or he types it in his list. Then he can see all the recipes that have that particular item used in them, starting from the most popular or most related depending on the search history of the customer.

Or if there is a list of items on a shelf or on the door of the fridge, the customer can use Optical Character Recognition to convert the printed text into typed text automatically, or he can type it himself. The above process repeats and he gets recommended products or recipes as per his needs. The items in the fridges or pantries can easily be typed by hand into a file, and a dataset can also be downloaded from here.
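
Once the item names are available as text, whether typed in or read from a label, the lookup itself can be a simple inverted index from ingredient to recipes. The sketch below assumes a hypothetical `recipes` mapping and hypothetical popularity scores; it ranks recipes by how many of the given items they use, then by popularity.

```python
from collections import Counter, defaultdict

def build_index(recipes):
    """Map each ingredient to the set of recipes that use it."""
    index = defaultdict(set)
    for name, ingredients in recipes.items():
        for ingredient in ingredients:
            index[ingredient.lower()].add(name)
    return index

def recipes_for_items(items, index, popularity):
    """Recipes using any of the items: most matches first, then most popular."""
    hits = Counter()
    for item in {i.strip().lower() for i in items}:
        for recipe in index.get(item, ()):
            hits[recipe] += 1
    return sorted(hits, key=lambda r: (-hits[r], -popularity.get(r, 0), r))

# Hypothetical recipe data and popularity scores.
recipes = {
    "chocolate cake": {"eggs", "cocoa powder", "flour", "milk"},
    "banana shake": {"bananas", "milk"},
    "omelette": {"eggs", "cheese"},
}
popularity = {"chocolate cake": 120, "omelette": 95, "banana shake": 80}

index = build_index(recipes)
milk_recipes = recipes_for_items(["Milk"], index, popularity)
egg_cheese_recipes = recipes_for_items(["eggs", " Cheese "], index, popularity)
```

With this data, `milk_recipes` is `["chocolate cake", "banana shake"]` and `egg_cheese_recipes` is `["omelette", "chocolate cake"]`. Ordering by the customer's search history, as mentioned above, would only require swapping in a different scoring dictionary.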


Benefits of such a system

Having such systems in the market has great benefits for both customers and businesses. Some of them are listed below:

  1. As a Customer:

    Customers can benefit a lot from such a handy system. It can reduce their day-to-day problems and provide an all-in-one solution. This app would allow them to:

    • list their grocery items

    • check availability of products at multiple grocery stores

    • get suggestions for products relevant to the most purchased products (both regional and personal)

    • get suggestions for recipes that they can try with the items they have listed

    • see recommended recipes

    • view and add ingredients of these recipes to their list

    • easily manage their list history

    • purchase items online.


    Thus the customers will be more than happy to invest their time in this all-in-one application that makes their life easier.



  1. Business Side

    Business owners are the ones who will truly benefit from this system: they can pay a set price to showcase their products as ingredients in recipes. The analysis of customer purchases made through this app can also be a great asset for targeted advertising: billboards or commercials from a regional perspective, and video or image ads from a personal perspective.

    The app can also subconsciously influence a customer into purchasing a specific product. For instance, if a customer has been seeing recipes and product recommendations related to a targeted product, he invariably will pick up the product when he goes shopping! This happens even if he doesn’t need the product as his mind has been tricked into buying it.

    Thus the business owners can increase sales and save a lot of costs on irrelevant advertisements.


Well, for a data scientist at Walmart or Kroger, this is just the tip of the iceberg. But with Datom by your side, you can connect the dots: it ingests data from multiple sources and in multiple formats, is easy to use, requires no coding, and brings data to one spot, which helps you with research, helps you analyze data consistently, and lets the data be fed into another system. Most importantly, Datom makes your life easier as a data scientist.

All you need to do is reach out to

Datom offers an automated, code-free integration for data housed in major platforms to Azure/Snowflake/Amazon Redshift or Google BigQuery. All data transfers are done with a drag and drop interface and are based on a transparent pricing mechanism based on actual usage.



All the code used here is available on GitHub and can be found here

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
