Rockbuster Stealth Customer Behavior Analysis
Project Overview
This project aims to leverage the power of data analysis to optimize Rockbuster Stealth's online video rental service. By utilizing SQL to extract historical rental data and creating insightful Tableau visualizations, we will gain valuable insights into customer behavior and preferences.
This includes identifying top-performing movies and understanding customer demographics. Additionally, we will identify high lifetime value customers, allowing us to develop targeted customer retention programs that foster loyalty and continued engagement.
Data Source
Rockbuster Data Set - This dataset was created by CareerFoundry for the purposes of this study.
Tools & Skills
Language: SQL (PostgreSQL)
Software: pgAdmin. DbVisualizer, Tableau & Excel
Relational Databases, SQL Database Querying (Filtering, Joins, Subqueries and CTE)
Data Visualizations
Key Questions
Which movies contributed the most/least to revenue gain?
What was the average rental duration for all videos?
Which countries are Rockbuster customers based in?
Where are customers with a high lifetime value based?
Do sales figures vary between geographic regions?
Analysis Process
1. Explore Database
Diagram the interrelationships between all tables using an entity relationship diagram (ERD).
2. Create Data Dictionary
Identify the metadata such as object name, data type, size, classification, and relationships with other data assets.
3. Descriptive Statistics
Leverage SQL queries to checking and cleaning the data, conduct descriptive statistics analysis.
4. Answer Business Questions
Perform Joins, Subqueries, and CTEs to answer business questions and refine queries.
5. Visualize Data & Present Results
Utilize Tableau create visualizations and PowerPoint to present insights and recommendations.
ERD & Data Dictionary
To create the dictionary, we first extracted the schema of Rockbuster’s database using DbVisualizer, and then determined the fact tables and dimension tables.
The data dictionary meticulously details each table's column names, data types, and descriptions. It also includes the tables linked to a specific table and the foreign keys connecting them.
SQL Query
This SQL query utilizes a Common Table Expression (CTE) to identify Rockbuster's top customers from the top 10 cities based on the total number of rentals. It first calculates the total rentals for each customer in each city and then selects the customers from the top 10 cities with the highest rental counts. The query helps Rockbuster to pinpoint its most valuable customers in the most significant cities, aiding in targeted marketing and service improvements. For more SQL queries, click the button below to view the GitHub repository.
Key Insights
Top 10 Countries
The Majority of Rockbuster Stealth customers live in the top ten countries, led by India and China.
There is a wide range of sales figures between each region. India has the highest sales with over $6,000, followed by China and the United States. To see Interactive Map Please click to review at Tableau
Recommendations
Maintaining a balance between new releases and popular movies inventory based on the average rental duration of five days.
Use the Top 10 movies as a guide to determine which genres, actors, and languages resonate with our audience.
Explore offering Bollywood and Chinese films to cater directly to India and China's demographics.
Offer exclusive benefits such as early access to new releases or discounted rental rates to retain high lifetime value customers.
Reflection
Growth in SQL proficiency: This project helped me solidify my understanding of core SQL concepts (joins, aggregations, filtering) and exposed me to more advanced techniques (CTEs, subqueries) that I can explore further.
Data schema complexity: Navigating complex database schemas and understanding table relationships was initially challenging, but I learned to effectively utilize data dictionary and diagram.
Error handling and troubleshooting: Debugging errors in my SQL queries was frustrating at times, but it helped me develop critical thinking and problem-solving skills.
Data limitations: The dataset was relatively small, with only 599 records. This limited the depth of analysis possible, making it difficult to uncover nuanced customer behavior patterns or identify long-term trends.