Chicago Cycling Data Analytics Project Report

A comprehensive data analytics project analyzing Cyclistic's bike usage patterns to develop data-driven strategies for increasing annual memberships through MySQL data cleaning and insightful Tableau Dashboard visualizations.
Live Project

Table of Contents

  1. Key Questions Answered by the Dashboard
  2. Statement of the Business Task
  3. Data Sources Used
  4. Data Cleaning and Manipulation
    • Handling Missing Values
    • Removing Outliers
    • Data Transformation
  5. Summary of Analysis
    • Total Rides Per Month
    • Rides Per Member
    • Rideable Type Distribution
    • Monthly Trends
    • Weekly Trends
    • Average Ride Duration
    • Hourly Trends
  6. Top Three Recommendations
    • Incentivizing Membership Conversion
    • Optimizing Marketing at Peak Times
    • Enhancing Rideable Type Availability

Key Questions Answered by the Dashboard:

  1. How do annual members and casual riders use Cyclistic bikes differently? 
  2. Why would casual riders buy Cyclistic annual memberships? 
  3. How can Cyclistic use digital media to influence casual riders to become members?

Statement of the Business Task

Cyclistic, a Chicago-based bike-share company, is focused on increasing its number of annual memberships. To support this goal, the marketing analyst team needs to understand how annual members and casual riders use Cyclistic bikes differently. Specifically, the task is to analyze riding patterns to develop a strategy that converts casual riders into annual members.

Data Sources Used

The analysis was conducted using data from Cyclistic’s bike-share system, which includes detailed ride data. The data was processed using MySQL, and the insights were visualized through Tableau Dashboards. The key data points analyzed include:

Creating table

CREATE TABLE bike_ride(
   ride_id VARCHAR(255),
    rideable_type VARCHAR(255),
    started_at DATETIME,
    ended_at DATETIME,
    start_station_name VARCHAR(255),
    start_station_id VARCHAR(255),
    end_station_name VARCHAR(255),
    end_station_id VARCHAR(255),
    start_lat FLOAT,
    start_lng FLOAT,
    end_lat FLOAT,
    end_lng FLOAT,
    member_casual VARCHAR(255)
);

Importing Data

Use Cyclistic’s historical trip data to analyze and identify trends. Download the previous 12 months of Cyclistic trip data here. (Note: The datasets have a different name because Cyclisticis a fictional company. For the purposes of this case study, the datasets are appropriate and will enable you to answer the business questions. The data has been made available by Motivate International Inc. under this license.) This is public data that you can use to explore how different customer types are using Cyclistic bikes. But note that data-privacy issues prohibit you from using riders’ personally identifiable information. This means that you won’t be able to connect pass purchases to credit card numbers to determine if casual riders live in the Cyclistic service area or if they have purchased multiple single passes.

Data Cleaning and Manipulation

Before conducting any analysis, it was essential to clean and prepare the data to ensure accuracy.

Handling Missing Values

Missing or incomplete data entries were identified and removed to prevent any skewed results. The SQL code for this process was:

SELECT *
FROM bike_ride
WHERE 
    start_station_id IS NULL OR start_station_id = '' OR
    end_station_id IS NULL OR end_station_id = '' OR
    start_station_name IS NULL OR start_station_name = '' OR
    end_station_name IS NULL OR end_station_name = '' OR
    start_lat IS NULL OR start_lat = '' OR
    start_lng IS NULL OR start_lng = '' OR
    end_lat IS NULL OR end_lat = '' OR
    end_lng IS NULL OR end_lng = '' OR
    ride_id IS NULL OR ride_id = '' OR
    rideable_type IS NULL OR rideable_type = '' OR
    member_casual IS NULL OR member_casual = '';
    
DELETE FROM bike_ride
WHERE 
    start_station_id IS NULL OR start_station_id = '' OR
    end_station_id IS NULL OR end_station_id = '' OR
    start_station_name IS NULL OR start_station_name = '' OR
    end_station_name IS NULL OR end_station_name = '' OR
    start_lat IS NULL OR start_lat = '' OR
    start_lng IS NULL OR start_lng = '' OR
    end_lat IS NULL OR end_lat = '' OR
    end_lng IS NULL OR end_lng = '' OR
    ride_id IS NULL OR ride_id = '' OR
    rideable_type IS NULL OR rideable_type = '' OR
    member_casual IS NULL OR member_casual = '';

Removing Outliers

To maintain the integrity of the analysis, rides with unrealistic durations were removed. The SQL code used for this step was:

DELETE FROM bike_ride
WHERE TIMESTAMPDIFF(HOUR, started_at, ended_at) < 0
   OR TIMESTAMPDIFF(HOUR, started_at, ended_at) > 24;

Data Transformation

Additional columns were created to enhance the analysis, such as ride duration and the day of the week:

ALTER TABLE bike_ride
ADD COLUMN ride_duration INT;

UPDATE bike_ride
SET ride_duration = TIMESTAMPDIFF(MINUTE, started_at, ended_at);

ALTER TABLE bike_ride
ADD COLUMN day_of_week VARCHAR(50);

UPDATE bike_ride
SET day_of_week = DAYNAME(started_at);

Summary of Analysis

The data analysis was divided into several key areas to understand how casual riders and annual members use Cyclistic bikes differently.

Top Three Recommendations with Actionable Examples

  1. Incentivizing Membership Conversion

    Recommendation:
    Cyclistic should create a targeted campaign to encourage casual riders to convert to annual memberships by offering tailored incentives based on their riding behavior.

    Actionable Example:

  1. Optimizing Marketing at Peak Times

    Recommendation:
    Utilize data-driven insights about peak usage times to optimize digital marketing efforts, targeting casual riders when they are most likely to be active.

    Actionable Example:

  1. Enhancing Rideable Type Availability

    Recommendation:
    Adjust the fleet distribution to ensure that the most popular rideable types, particularly electric bikes, are readily available during peak usage times and at high-demand stations.

    Actionable Example:

These actionable strategies are designed to leverage the data insights gathered from the analysis, ensuring that Cyclistic not only increases its annual memberships but also enhances overall customer satisfaction and loyalty. By aligning marketing efforts with the identified behavior patterns of casual riders, Cyclistic can effectively drive membership growth and improve the user experience across the board.

Related Data-Driven Projects:

Have a Project in Mind? Let’s Collaborate and Bring It to Life

Embark on a journey where creativity meets precision, turning your design visions into dynamic realities.