Optimizer
Optimization
Information regarding the optimization algorithm of Poolerz.
Overview
The carpool optimization algorithm clusters users into valid carpool groups based on availability, location, and vehicle capacity. It utilizes DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for clustering and ensures valid and practical carpools with additional validation and corrections.
Inputs
The optimizer takes a JSON input with the following structure:
Algorithm Steps
1. Initialization
- Extract user data and carpool details.
- Convert addresses to latitude/longitude using an API.
- Assign user availability and carpool days.
- Define algorithm parameters such as
min_cluster_size
,max_cluster_size
, andtarget_size
.
2. DBSCAN Clustering
- Compute the Haversine distance between all pairs of users.
- Determine the
eps
value as the 5th-10th percentile of all distances. - Run DBSCAN with
getNeighbors()
andexpandCluster()
helpers. - Calculate cluster centroids using
calculateCentroid()
. - Ensure no user is both clustered and unclustered.
3. Validation & Splitting
- Validate clusters based on:
- Car capacity: Ensure at least one user can drive all members.
- Availability: At least one driver is available per carpool day.
- Distance Threshold: Ensure all users in a cluster are within
eps * distance_threshold_factor
.
- Split large clusters into subclusters using
splitLargeCluster()
:- Find the user minimizing the sum of distances to all others.
- Form valid subclusters with nearest users.
- Uncluster invalid formations.
- Uncluster invalid clusters.
4. Finalization & Schedule Creation
- Validate clusters again.
- Assign driving schedules:
- Map
{day -> [drivers]}
and{driver -> count}
. - Assign the least-used eligible driver per day.
- Map
- Attempt to re-cluster unclustered users by selecting the user with the lowest sum of distances.
- Loop until no more users can be clustered.
Outputs
The algorithm returns:
- Initial Clusters (Raw DBSCAN Output)
- Validated Clusters (After Splitting & Validation)
- Final Clusters (With Driving Schedule)
- Unclustered Users (Accumulated throughout validation)
Example output: