Overview

The carpool optimization algorithm clusters users into valid carpool groups based on availability, location, and vehicle capacity. It utilizes DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for clustering and ensures valid and practical carpools with additional validation and corrections.


Inputs

The optimizer takes a JSON input with the following structure:

{
    "carpoolId": "001",
    "carpoolName": "Midtown Soccer Club",
    "carpoolLocation": {
        "name": "Midtown High School",
        "address": "929 Charles Allen Dr NE",
        "city": "Atlanta",
        "state": "GA",
        "zipCode": "30309"
    },
    "carpoolDays": [2],
    "carpoolMembers": [1, 2, 3],
    "availabilities": [
        { "userId": "1", "availability": [0, 2, 4, 5] },
        { "userId": "2", "availability": [0, 1, 4, 6] },
        { "userId": "3", "availability": [1, 3, 4, 5, 6] }
    ],
    "users": [
        {
            "userId": "1",
            "name": "Selina Meyer",
            "numchildren": 2,
            "children": ["Neha", "Shriya"],
            "carCapacity": 4,
            "location": {
                "address": "120 North Avenue NW",
                "city": "Atlanta",
                "state": "GA",
                "zipCode": "30313"
            }
        }
    ]
}

Algorithm Steps

1. Initialization

  • Extract user data and carpool details.
  • Convert addresses to latitude/longitude using an API.
  • Assign user availability and carpool days.
  • Define algorithm parameters such as min_cluster_size, max_cluster_size, and target_size.

2. DBSCAN Clustering

  • Compute the Haversine distance between all pairs of users.
  • Determine the eps value as the 5th-10th percentile of all distances.
  • Run DBSCAN with getNeighbors() and expandCluster() helpers.
  • Calculate cluster centroids using calculateCentroid().
  • Ensure no user is both clustered and unclustered.

3. Validation & Splitting

  • Validate clusters based on:
    • Car capacity: Ensure at least one user can drive all members.
    • Availability: At least one driver is available per carpool day.
    • Distance Threshold: Ensure all users in a cluster are within eps * distance_threshold_factor.
  • Split large clusters into subclusters using splitLargeCluster():
    • Find the user minimizing the sum of distances to all others.
    • Form valid subclusters with nearest users.
    • Uncluster invalid formations.
  • Uncluster invalid clusters.

4. Finalization & Schedule Creation

  • Validate clusters again.
  • Assign driving schedules:
    • Map {day -> [drivers]} and {driver -> count}.
    • Assign the least-used eligible driver per day.
  • Attempt to re-cluster unclustered users by selecting the user with the lowest sum of distances.
  • Loop until no more users can be clustered.

Outputs

The algorithm returns:

  • Initial Clusters (Raw DBSCAN Output)
  • Validated Clusters (After Splitting & Validation)
  • Final Clusters (With Driving Schedule)
  • Unclustered Users (Accumulated throughout validation)

Example output:

{
    "finalClusters": [
        {
            "users": ["1", "2"],
            "schedule": [
                { "userId": "1", "drivingDays": [1, 3] },
                { "userId": "2", "drivingDays": [2, 4] }
            ]
        }
    ],
    "unclusteredUsers": ["3"]
}