Portfolio

Instacart

Objective

Perform an initial exploratory analysis of a subset of Instacart's data in order to derive insights and suggest strategies for better customer segmentation based on the provided criteria.

Problem

Instacart is considering a targeted marketing strategy.  In order to target the right customer profiles with the appropriate products, they want to learn more about their sales trends and purchasing behaviour.

Data

Techniques Applied

  • Data Cleaning: Wrangling
  • Combining and Exporting Data
  • Grouping Data and Aggregating Variables
  • Python Visualization
  • Excel Report

Tools

The Process

The data preparation phase

The first step was to clean the raw Instacart data and the CF-generated data: I checked for missing values and inconsistent data types, removed columns that would not be used, and then merged the data sets. Because the questions to be answered were known in advance, I derived the variables needed for the corresponding analyses.
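The cleaning and merging steps described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the two small tables and every column name (user_id, order_hour_of_day, eval_set, age, income) are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the orders and customers tables.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "user_id": ["10", "11", "10", "12"],        # key stored as text by mistake
    "order_hour_of_day": [9, 14, None, 11],      # one missing value
    "eval_set": ["prior", "prior", "prior", "prior"],  # not needed later
})
customers = pd.DataFrame({
    "user_id": [10, 11, 12],
    "age": [34, 58, 71],
    "income": [52000, 98000, 41000],
})

# 1. Count missing values per column.
missing = orders.isnull().sum()

# 2. Fix the inconsistent data type so the merge key matches on both sides.
orders["user_id"] = orders["user_id"].astype("int64")

# 3. Remove columns that will not be used in the analysis.
orders = orders.drop(columns=["eval_set"])

# 4. Merge the tables; the indicator column records where each row matched.
merged = orders.merge(customers, on="user_id", how="inner", indicator=True)
```

Inspecting merged["_merge"] after the join is a quick way to confirm that every order found a matching customer record.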

The analysis phases

To give structure to the list of questions, I divided the analysis into four sections: time analysis, product analysis, customer analysis, and customer profile. Given the limitations of the data set, each section had its own set of restrictions that I had to work around.

The results

Although the analysis was based on sound reasoning, the limitations of the data were visible in the results. The relevant parties were informed of the results of the analysis, with a clear disclaimer regarding those initial limitations.

Time Analysis

With 0 on the x-axis corresponding to Sunday, the most purchases are made on Sundays. Tuesday and Wednesday, by contrast, have the fewest purchases of the week.

The analysis of the busiest hours of the day showed that shopping peaks between 10 am and 3 pm.
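Counts like the ones behind these two charts can be obtained with a simple frequency table. The sketch below assumes one row per order and two hypothetical columns, orders_day_of_week (0 to 6) and order_hour_of_day (0 to 23); the sample values are invented for illustration only.

```python
import pandas as pd

# Hypothetical sample: one row per order.
orders = pd.DataFrame({
    "orders_day_of_week": [0, 0, 0, 1, 1, 3, 4, 6],
    "order_hour_of_day": [10, 11, 14, 10, 20, 9, 13, 10],
})

# Number of orders per day of the week, sorted by day index.
orders_per_day = orders["orders_day_of_week"].value_counts().sort_index()

# Number of orders per hour of the day, sorted by hour.
orders_per_hour = orders["order_hour_of_day"].value_counts().sort_index()

# Day index and hour with the most orders in this sample.
busiest_day = orders_per_day.idxmax()
busiest_hour = orders_per_hour.idxmax()
```

Either series can then be drawn as a bar chart, for example with orders_per_day.plot.bar() when matplotlib is installed.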

Products Analysis

The bar graphs show the distribution of products by price, where it can be seen that consumers prefer to buy products that cost less than $15. Ideally, the popularity of a department would be determined by the quantity of products sold, but since that figure is unknown, I ranked departments by the total number of times products were ordered from each department.
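As a rough sketch of the two measures used in this section, the example below computes the share of ordered items priced under $15 and ranks departments by how often their products appear in orders. The table, the column names (department, prices) and the values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample: one row per product line within an order.
order_lines = pd.DataFrame({
    "department": ["produce", "produce", "dairy eggs",
                   "produce", "snacks", "dairy eggs"],
    "prices": [3.5, 7.2, 4.1, 16.0, 2.9, 12.4],
})

# Share of ordered items that cost less than $15.
share_under_15 = (order_lines["prices"] < 15).mean()

# Proxy for popularity: number of times products from each department
# were ordered (value_counts sorts from most to least frequent).
department_rank = order_lines["department"].value_counts()
```

Counting rows measures how often a department appears in orders, not how many units were sold, which is why it serves only as a proxy for popularity.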

Customer Analysis

The consumer profile could be defined despite the known bias in the initial data. I chose to group customers by age: the largest group was between 41 and 65 years old, followed by the youngest group, aged 18 to 40.
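One way to build age groups like these is pandas' cut function. In the sketch below the 18 to 40 and 41 to 65 brackets come from the text, while the "66+" bracket, the column names and the sample ages are assumptions added to make the example complete.

```python
import pandas as pd

# Hypothetical sample of customers.
customers = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "age": [18, 25, 41, 52, 65, 70],
})

# Bin edges are right-inclusive: (17, 40], (40, 65], (65, inf).
customers["age_group"] = pd.cut(
    customers["age"],
    bins=[17, 40, 65, float("inf")],
    labels=["18-40", "41-65", "66+"],   # "66+" is an assumed third bracket
)

# Number of customers in each age group.
group_sizes = customers["age_group"].value_counts()
largest_group = group_sizes.idxmax()
```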

Customer Profiling

The task was to compare ordering habits across customer criteria; the analysis showed no significant differences between subsets. I decided to focus on income and grouped customers by region and income group. The resulting graph shows that most customers fall in the middle-income group, and there is not much difference between regions within each income group.
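A region-by-income comparison of this kind can be sketched as a cross-tabulation. The income thresholds (50,000 and 150,000), the region names, the column names and the sample values below are all assumptions for illustration; the original brackets are not stated in the text.

```python
import pandas as pd

# Hypothetical sample of customers.
customers = pd.DataFrame({
    "region": ["South", "South", "West", "West", "Midwest", "Northeast"],
    "income": [45000, 90000, 95000, 160000, 88000, 72000],
})

# Assumed income brackets: low (up to 50k), middle (up to 150k), high (above).
customers["income_group"] = pd.cut(
    customers["income"],
    bins=[0, 50000, 150000, float("inf")],
    labels=["low", "middle", "high"],
)

# Customers per region and income group.
counts = pd.crosstab(customers["region"], customers["income_group"])

# Row-wise shares, so regions of different size can be compared.
shares = counts.div(counts.sum(axis=1), axis=0)
```

Comparing the rows of shares shows whether the income mix really differs between regions or only the overall customer numbers do.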

Recommendations

The busiest day of the week is Saturday, followed by Sunday. With this in mind, more advertising and marketing campaigns should be launched on the slower days, such as Tuesday and Wednesday. New parents and retired adults are the customers who buy the least, so one option is to offer them delivery to make purchasing easier, in addition to discounts, exchanges, or bonus points. As customer profiles become more targeted, less frequented departments should be included if the target group is shown to prefer items from those departments.