Methodology

How we collect and calculate the data shown on Robotaxi Tracker

Overview

Robotaxi Tracker is a community-driven platform that aggregates data about autonomous vehicle fleets through user submissions, automated data collection, and statistical modeling. This page explains how each type of data is collected, processed, and displayed on the site.

All data shown on the site represents only vehicles and trips that have been discovered and reported by the community. The figures do not represent official fleet totals from service providers and may significantly undercount the actual fleet size.

Vehicle Data Collection

Discovery Process

Vehicles are added to the tracker through community submissions. When a user spots a robotaxi, they can:

Sign in and submit the license plate number (validated by provider-specific formats)
Upload at least one photo of the vehicle for verification (up to three)
Specify exterior and interior colors (Tesla submissions only; Waymo uses defaults)
Indicate the service area where it was spotted
Add optional notes about the sighting

Verification and Review

All vehicle submissions enter a moderated queue and are reviewed by administrators before being added to the tracker. The review process includes:

Verification of license plate format and validity
Review of submitted photos for authenticity
Checking for duplicate submissions
Validation of service area assignment

Once approved, the vehicle is added to the database with a "first spotted" timestamp. Verification photos are deleted after approval to protect user privacy.

Spottings and Color Suggestions

For vehicles already in the fleet, signed-in users can submit spottings and color corrections:

Spottings: Require at least one photo and a sighting date. Approved spottings can update a vehicle's first/last spotted dates.
Color Suggestions: Let users suggest exterior/interior color changes (photos optional) that are reviewed by admins.

Vehicle Attributes

For each vehicle, we track:

License Plate: Unique identifier for the vehicle
First Spotted: Date and time when the vehicle was first discovered
Last Spotted: Most recent sighting or trip report
Exterior Color: Vehicle exterior paint color
Interior Color: Vehicle interior color
Service Area: Primary service area where the vehicle operates
Provider: Service provider (Tesla or Waymo)
Total Trips: Number of trips logged for this vehicle

Trip Data Collection

Trip Logging

Users can log trips they take in robotaxis. Each trip submission includes:

Signed-in user account
Vehicle license plate (selected from discovered vehicles)
Trip distance (stored in miles; entries in kilometers are converted)
Optional trip date (defaults to the current date if blank)
Service area where the trip took place
At least one screenshot/photo for verification (up to three)
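
The distance and date handling described above could be sketched roughly as follows. This is a minimal illustration, not the site's actual code: the function name `normalize_trip`, the unit strings `"mi"` and `"km"`, and rounding to two decimals are all assumptions.

```python
from datetime import date
from typing import Optional

KM_PER_MILE = 1.609344  # exact by definition of the international mile


def normalize_trip(distance: float, unit: str,
                   trip_date: Optional[date] = None,
                   today: Optional[date] = None) -> dict:
    """Return a trip record with distance in miles and a concrete date."""
    if unit not in ("mi", "km"):
        raise ValueError(f"unsupported unit: {unit}")
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Distances are stored in miles; kilometer entries are converted.
    miles = distance if unit == "mi" else distance / KM_PER_MILE
    # A blank trip date falls back to the current date.
    today = today or date.today()
    return {"miles": round(miles, 2), "date": trip_date or today}
```

Passing `today` explicitly is only there to make the default-date behavior easy to test.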

Trip Verification

Trip submissions enter a moderated queue and are reviewed by administrators to ensure:

Vehicle exists in the tracker
Trip distance is reasonable
Date is valid and not in the future
Screenshot/photo supports the submission

Approved trips are added to the database and contribute to vehicle statistics, service area totals, and community leaderboards. Verification screenshots are deleted after approval.

Trip Statistics

Trip data is aggregated to calculate:

Total miles logged across all trips
Total number of trips
Average trip distance
Trips per vehicle
Miles per service area

Important: Trip statistics represent only approved trips reported by users on this site. They do not represent the total miles or trips for the entire fleet.
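
As a rough sketch of the aggregation listed above, assuming each approved trip is a record with hypothetical `plate`, `area`, and `miles` fields (the real schema is not described on this page):

```python
from collections import defaultdict


def trip_statistics(trips):
    """Aggregate approved trips: iterable of dicts with 'plate', 'area', 'miles'."""
    trips = list(trips)
    total_miles = sum(t["miles"] for t in trips)
    trips_per_vehicle = defaultdict(int)
    miles_per_area = defaultdict(float)
    for t in trips:
        trips_per_vehicle[t["plate"]] += 1
        miles_per_area[t["area"]] += t["miles"]
    return {
        "total_miles": total_miles,
        "total_trips": len(trips),
        # Average is undefined with no trips; 0.0 is an arbitrary placeholder.
        "avg_miles": total_miles / len(trips) if trips else 0.0,
        "trips_per_vehicle": dict(trips_per_vehicle),
        "miles_per_area": dict(miles_per_area),
    }
```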

Wait Time Data Collection

Data Source

Wait time data is collected through automated monitoring of service availability. The system captures:

Service status (available, high demand, or unavailable)
Estimated wait time in minutes (when available)
Timestamp of each status check
Source of the data point

We collect Tesla wait times from fixed reference points within each service area. Each collection cycle captures multiple points and is grouped as a single run.

Availability Metrics

Wait time analytics calculate:

Percentage of time service is available
Average wait time when service is available
Peak wait times
Best and worst times to request a ride
Hourly and daily availability patterns

Availability is counted as "available" only when a numeric ETA is returned. Points marked as high demand, unavailable, or error are counted as unavailable.
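
The availability rule can be illustrated with a short sketch. The record shape (`status`, `eta_min`) and the status strings are assumptions made for the example; only the rule itself (a point counts as available only when it returns a numeric ETA) comes from the description above.

```python
def availability_summary(points):
    """points: list of dicts with 'status' and 'eta_min' (None when no ETA)."""
    # Only points that are available AND carry a numeric ETA count as available.
    etas = [p["eta_min"] for p in points
            if p["status"] == "available" and p["eta_min"] is not None]
    total = len(points)
    return {
        "available_pct": 100.0 * len(etas) / total if total else 0.0,
        "avg_wait_min": sum(etas) / len(etas) if etas else None,
        "peak_wait_min": max(etas) if etas else None,
    }
```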

Wait time data is currently collected for Tesla service areas in Austin and the Bay Area. Austin switched to 24/7 service on December 11, 2025 (Austin local time). For Austin data before that date, analytics exclude the legacy 2am–6am downtime window; from that date on, all hours are included.
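
A minimal sketch of the Austin downtime filter, assuming timestamps have already been converted to Austin local time and that the cutoff falls at local midnight on December 11, 2025 (the exact switch time is not stated here, so that is an assumption):

```python
from datetime import datetime

# Assumed cutoff: start of December 11, 2025 in Austin local time.
SWITCH_TO_24_7 = datetime(2025, 12, 11)


def include_in_austin_analytics(local_ts: datetime) -> bool:
    """Return True if a data point should count toward Austin analytics."""
    if local_ts >= SWITCH_TO_24_7:
        return True  # 24/7 service: every hour counts
    # Before the switch, skip the legacy 2am-6am downtime window.
    return not (2 <= local_ts.hour < 6)
```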

Estimated Active Vehicles (ETA-Based)

We publish an "Est. Active Vehicles" figure derived solely from the unified wait time ETAs. The goal is to infer the minimum active fleet size needed to explain the observed ETA pattern at the time of collection. This is not the total fleet size and will generally be lower than the full fleet when vehicles are offline, charging, or outside the sampled area.

Recent window: We take the latest run timestamp and include points from the last 30 minutes.
Point aggregation: For each point label, we use the median ETA from that window. Unavailable points (null ETA, high demand, or error) are excluded from the ETA estimate.
ETA floor/cap: We apply percentile floors and caps to reduce noise and outliers before conversion (default floor: 20th percentile, cap: 90th percentile).
ETA → coverage radius: Each ETA becomes a coverage radius using a speed and circuity factor to approximate road-network travel time.
Greedy set cover (upper bound): We assume vehicles can be positioned at observed points and greedily pick a small set of positions that can "cover" all points within their radii. Because the greedy choice is not guaranteed to be minimal, this count serves as the upper bound.
Greedy packing (lower bound): We greedily select a set of points that cannot be covered by the same vehicle (mutually incompatible). Each such point needs its own vehicle, which gives a minimum active-vehicle bound.
Final estimate: We take the geometric mean of the bounds as the mid estimate and publish the low/high range.
Confidence: Based on number of available points, ETA variance, and how wide the low/high range is.
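
The steps above can be sketched as follows. This is an illustrative reading, not the production code. Assumptions: points are given in planar kilometer coordinates; a vehicle at point j "covers" point i when the straight-line distance is within i's radius; two points are "incompatible" when their coverage circles do not overlap; the speed (30 km/h), circuity (1.4), and nearest-rank percentile method are placeholder choices. The confidence score is omitted.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def estimate_active(points, speed_kmh=30.0, circuity=1.4, floor_pct=20, cap_pct=90):
    """points maps label -> (x_km, y_km, [eta_minutes, ...]) for the recent window."""
    # Points with no numeric ETA in the window are excluded.
    labels = [k for k, (_, _, etas) in points.items() if etas]
    if not labels:
        return None
    med = {k: median(points[k][2]) for k in labels}
    eta_floor = percentile(list(med.values()), floor_pct)
    eta_cap = percentile(list(med.values()), cap_pct)
    # Clamp each median ETA, then convert minutes to a straight-line radius in km.
    radius = {k: min(max(med[k], eta_floor), eta_cap) / 60 * speed_kmh / circuity
              for k in labels}

    def dist(a, b):
        return math.hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])

    # Upper bound: greedy set cover with vehicles placed at observed points.
    uncovered, high = set(labels), 0
    while uncovered:
        best = max(labels, key=lambda j: sum(dist(i, j) <= radius[i] for i in uncovered))
        uncovered -= {i for i in uncovered if dist(i, best) <= radius[i]}
        high += 1

    # Lower bound: greedy packing of points whose circles do not overlap.
    packed = []
    for k in sorted(labels, key=lambda k: radius[k]):
        if all(dist(k, p) > radius[k] + radius[p] for p in packed):
            packed.append(k)
    low = len(packed)
    return {"low": low, "high": high, "mid": math.sqrt(low * high)}
```

Under these definitions the packing count can never exceed the cover count, since points with non-overlapping circles each require a separate vehicle.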

Important limitations: The estimate is sensitive to ETA inflation, sparse sampling, and uneven coverage. It measures the minimum active vehicles implied by the ETAs at that moment, not the full fleet size.

Estimated Available Vehicles (Homepage Widget)

The homepage displays an "Est. Available" figure that estimates how many vehicles are currently idle and ready to accept new ride requests. This is distinct from the "Est. Active Vehicles" calculation above, which estimates the minimum number of vehicles in active service implied by the observed ETAs.

The Core Challenge

When you see a 6-minute ETA, it could mean:

An idle car 6 minutes away (truly available)
A car finishing a ride in 4 minutes, then 2 minutes to reach you
A car repositioning that will arrive in 6 minutes

We cannot directly distinguish these cases from ETA data alone. However, we can make intelligent inferences using multiple signals.

Signal 1: Minimum ETA (Primary Indicator)

Key insight: Very short ETAs (under 2 to 3 minutes) strongly suggest truly idle cars. A car that must first finish a ride and then drive to you is unlikely to quote an accurate ETA under 3 minutes; that precision usually comes only from an idle car that is already nearby.

We calculate an "idle confidence" score based on the minimum ETA across all sampling points:

idleConfidence = max(0, 1 - minEtaMinutes / 10)

Min ETA = 0 min → idleConfidence = 1.0 (high confidence in idle cars)
Min ETA = 3 min → idleConfidence = 0.7
Min ETA = 7 min → idleConfidence = 0.3 (likely all in-transit)
Min ETA ≥ 10 min → idleConfidence = 0 (all cars likely busy)
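
The formula and the values listed above can be reproduced with a few lines of Python (the function name is chosen for illustration):

```python
def idle_confidence(min_eta_minutes: float) -> float:
    """idleConfidence = max(0, 1 - minEtaMinutes / 10)."""
    return max(0.0, 1.0 - min_eta_minutes / 10.0)


# Print the score for the ETAs used in the list above.
for eta in (0, 3, 7, 10, 15):
    print(f"min ETA {eta:>2} min -> idleConfidence {idle_confidence(eta):.1f}")
```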

Signal 2: System Load (Secondary Indicator)

The average ETA across all points reflects overall system demand. Low averages indicate slack in the system (cars have idle time between rides), while high averages suggest cars are assigned a new ride immediately after dropping off passengers.

systemSlack = max(0, 1 - avgEtaMinutes / 12)

Signal 3: Geographic Coverage

A larger number of sampling points reporting service suggests better fleet distribution. We use the square root to account for overlapping coverage areas (nearby points likely share the same available cars).

coverageFactor = √availablePointCount

Combined Estimate

We weight the idle confidence heavily (70%) since it's our best signal for truly idle vehicles, with system slack as a secondary factor (30%):

combinedScore = idleConfidence × 0.7 + systemSlack × 0.3
rawEstimate = combinedScore × coverageFactor × 1.5
estimate = round(rawEstimate), bounded by [minBound, maxBound]

Bounds and Edge Cases

Minimum bound: Returns at least 1 if there's strong availability (minETA < 5 and avgETA < 8), otherwise can return 0
Maximum bound: Capped at 50% of available points (can't have more idle cars than coverage suggests)
Zero case: Returns 0 (and hides the widget) when all signals indicate no idle cars
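
Putting the three signals and the bounds together, a minimal sketch might look like this. The function name, the round-half-up rule, flooring the 50% cap, and the order in which the bounds are applied are assumptions; the weights and thresholds come from the text above.

```python
import math


def estimate_available(point_count: int, min_eta: float, avg_eta: float) -> int:
    """Estimate idle vehicles from the available point count and ETA summary."""
    if point_count == 0:
        return 0
    idle_confidence = max(0.0, 1 - min_eta / 10)   # Signal 1
    system_slack = max(0.0, 1 - avg_eta / 12)      # Signal 2
    coverage_factor = math.sqrt(point_count)       # Signal 3
    combined = idle_confidence * 0.7 + system_slack * 0.3
    raw = combined * coverage_factor * 1.5
    rounded = math.floor(raw + 0.5)  # round half up (assumed rounding rule)
    # At least 1 when availability is strong; otherwise 0 is allowed.
    min_bound = 1 if (min_eta < 5 and avg_eta < 8) else 0
    # No more than 50% of available points (floored; an assumption).
    max_bound = math.floor(point_count * 0.5)
    return max(min_bound, min(rounded, max_bound))
```

A result of 0 corresponds to the case where the widget is hidden.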

Example Calculations

Scenario A: 12 points, minETA=2min, avgETA=5min
idleConfidence = 1 - 2/10 = 0.8
systemSlack = 1 - 5/12 = 0.58
combined = 0.8×0.7 + 0.58×0.3 = 0.73
estimate = 0.73 × √12 × 1.5 ≈ 4 vehicles

Scenario B: 10 points, minETA=8min, avgETA=10min
idleConfidence = 1 - 8/10 = 0.2
systemSlack = 1 - 10/12 = 0.17
combined = 0.2×0.7 + 0.17×0.3 = 0.19
estimate = 0.19 × √10 × 1.5 ≈ 0.90, which rounds to 1 vehicle (the minimum bound of 1 does not apply here because minETA ≥ 5)

Scenario C: 8 points, minETA=12min, avgETA=14min
idleConfidence = 0 (min ≥ 10)
systemSlack = 0 (avg ≥ 12)
estimate = 0 vehicles (widget hidden)

Limitations: This model cannot perfectly distinguish idle from in-transit vehicles. It works best when ETAs vary across points. The estimate updates in real time as wait time data refreshes (approximately every 60 seconds).

Fleet Statistics and Calculations

Vehicle Counts

The homepage displays "Rider Vehicles" as the primary count, representing vehicles that are actively providing rides to customers. This count:

Includes only vehicles submitted and approved by the community
Excludes vehicles marked as test vehicles (see below)
Is filtered by service area when a specific area is selected
Can be filtered by provider (Tesla or Waymo)
Updates in real time as new vehicles are discovered

Below the rider count, we display "of X total" which shows the complete count including all discovered vehicles (rider + test).
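
As a small illustration of the two counts, assuming each approved vehicle record has hypothetical `area`, `provider`, and `is_test` fields:

```python
def vehicle_counts(vehicles, area=None, provider=None):
    """Return the rider count and the 'of X total' count for the given filters."""
    selected = [v for v in vehicles
                if (area is None or v["area"] == area)
                and (provider is None or v["provider"] == provider)]
    # Rider vehicles are everything not flagged as a test vehicle.
    riders = sum(1 for v in selected if not v["is_test"])
    return {"rider": riders, "total": len(selected)}
```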

Limitation: This count represents only discovered vehicles, not the total fleet size. The actual fleet may be significantly larger.

Rider Vehicles vs. Test Vehicles

We distinguish between two types of vehicles in the fleet:

Rider Vehicles: Vehicles actively providing rides to customers through the public service. These are the vehicles you can request through the Tesla or Waymo app.
Test Vehicles: Vehicles used for development, testing, and validation purposes. These vehicles are not available for public rides.

What counts as a test vehicle?

Cybercab prototypes: Tesla's next-generation robotaxi design (no steering wheel or pedals) currently undergoing testing
Fully autonomous test cars: Vehicles operating without safety drivers as part of testing programs before public deployment
Development vehicles: Cars used by engineering teams to test new software features, sensors, or hardware configurations
Validation fleet: Vehicles used for regulatory compliance testing and safety validation

Administrators mark vehicles as test vehicles during the review process based on visible identifiers (e.g., "TEST VEHICLE" markings, Cybercab body style, known test fleet plates) or information from reliable sources.

Fleet Growth Tracking

Fleet growth charts use different data sources by provider:

Tesla: cumulative discovered vehicles over time, based on each vehicle's "first spotted" date
Waymo: officially reported fleet milestones over time from Waymo statements and major news outlets
Tesla charts include service-area splits, cumulative totals, and fleet density

Historical "first spotted" dates may be approximate, especially for Tesla vehicles discovered early in the tracking effort. Dates are refined over time as more accurate information becomes available.
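
For the Tesla chart, the cumulative series can be derived from "first spotted" dates roughly as follows (a sketch; the function name and input shape are assumptions):

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_discovered(first_spotted_dates):
    """Return [(day, cumulative vehicles discovered up to and including day)]."""
    per_day = Counter(first_spotted_dates)
    days = sorted(per_day)
    totals = accumulate(per_day[d] for d in days)
    return list(zip(days, totals))
```

Only days with at least one new discovery appear in the output; a chart would carry the last total forward across the gaps.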

Fleet Mileage Estimates

Anchor + Active-Vehicle Weighted Model

For Austin, estimated cumulative mileage is now computed using an anchor-constrained model that blends official mileage milestones with tracker-observed active-vehicle activity. This approach:

Anchors to official cumulative mileage points confirmed by Tesla
Builds a daily active-vehicle signal from tracker data (vehicle active on each day from first_spotted through last_spotted + 30 days)
Smooths that daily signal with EWMA and applies a conservative phase-level fleet floor
Allocates miles between anchor pairs proportionally to the smoothed active-vehicle weights so anchor totals are exactly preserved
After the latest anchor, nowcasts daily mileage from blended anchor productivity, adjusted by recent active-vehicle trend
Publishes a bounded confidence band for projected (post-anchor) days based on anchor-segment productivity variability
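
The daily active-vehicle signal described above (a vehicle counts as active from first_spotted through last_spotted + 30 days) could be built like this. The input shape, a list of date pairs, is an assumption for illustration.

```python
from datetime import date, timedelta


def daily_active_counts(vehicles, start, end, grace_days=30):
    """vehicles: list of (first_spotted, last_spotted) date pairs."""
    grace = timedelta(days=grace_days)
    counts = {}
    day = start
    while day <= end:
        # Active on `day` if it falls inside the vehicle's window plus grace period.
        counts[day] = sum(1 for first, last in vehicles
                          if first <= day <= last + grace)
        day += timedelta(days=1)
    return counts
```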

Core Formulas

Smoothing: S_t = 0.25 * A_t + 0.75 * S_(t-1)
Effective activity floor: effective_t = max(S_t, phaseFallback_t)
Anchor segment allocation: dailyMiles_t = ΔM_i * (effective_t / W_i), where W_i = Σ effective_t across (anchor_i, anchor_(i+1)]
Post-anchor nowcast: dailyMiles_t = P_blend * effective_t * T_t, where T_t = clamp(S_t / EMA28(S)_t, 0.90, 1.10)
Projected confidence: bandPct = clamp(std(P_i) / mean(P_i), 0.10, 0.25)
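
The formulas above translate into Python fairly directly. This sketch makes a few assumptions that the formulas leave open: the EWMA is seeded with the first observation, std is taken as the population standard deviation, and all function names are illustrative.

```python
from statistics import mean, pstdev


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def ewma(active, alpha=0.25):
    """S_t = alpha * A_t + (1 - alpha) * S_(t-1), seeded with the first value."""
    smoothed, prev = [], None
    for a in active:
        prev = a if prev is None else alpha * a + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def effective_activity(smoothed, phase_fallback):
    """effective_t = max(S_t, phaseFallback_t)."""
    return [max(s, f) for s, f in zip(smoothed, phase_fallback)]


def allocate_segment(delta_miles, effective):
    """dailyMiles_t = delta_M_i * effective_t / W_i for one anchor segment."""
    w = sum(effective)
    return [delta_miles * e / w for e in effective]


def nowcast_day(p_blend, effective_t, s_t, ema28_t):
    """dailyMiles_t = P_blend * effective_t * T_t, with T_t clamped to [0.90, 1.10]."""
    trend = clamp(s_t / ema28_t, 0.90, 1.10)
    return p_blend * effective_t * trend


def band_pct(productivities):
    """bandPct = clamp(std(P_i) / mean(P_i), 0.10, 0.25)."""
    return clamp(pstdev(productivities) / mean(productivities), 0.10, 0.25)
```

Because each day's share is effective_t / W_i, the allocated daily miles within a segment sum back to the anchor difference (up to floating-point rounding).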

Known Data Points

The curve is constrained by official Tesla-reported cumulative paid miles, including:

Jul 23, 2025: 7,000 miles
Oct 22, 2025: 250,000 miles
Dec 31, 2025: 650,000 miles

Important: These are estimates based on modeling, not official figures. Actual fleet mileage may differ significantly from these estimates.

Data Accuracy and Limitations

Community-Driven Data

All data on Robotaxi Tracker is community-driven, which means:

We can only track vehicles that have been discovered and reported
Vehicle counts represent discovered vehicles, not total fleet size
Trip data represents only trips logged by users, not all trips
Some vehicles may be discovered multiple times before being added
Historical dates may be approximate, especially for early discoveries

Waymo Fleet Data

Waymo data collection on this site is still early, but the fleet growth chart now uses official reporting milestones:

Chart totals are based on publicly reported official milestones (Waymo + major outlets)
Our community "discovered vehicles" data for Waymo remains incomplete
Coverage still varies significantly by service area for crowdsourced entries
City-level completeness will improve as Waymo detection coverage expands

Data Quality Measures

To maintain data quality, we:

Review all submissions before adding them to the tracker
Verify license plates and check for duplicates
Require photos or screenshots for verification
Allow users to suggest corrections (color updates, etc.)
Maintain audit trails for data changes
Regularly clean and deduplicate data

Data Sources Summary

Primary Sources

Community Submissions: Vehicle discoveries, trip logs, spottings, and color suggestions
Automated Monitoring: Tesla service availability and wait time data
Official Data Points: Confirmed mileage figures used for estimation models

Data Processing

All submissions reviewed by administrators
Data stored in Supabase database
Statistics calculated in real time from database queries
Charts and visualizations generated client-side

Data Updates

Vehicle and trip data updates as new submissions are approved
Wait time data updates automatically (frequency varies by data source)
Statistics recalculated on each page load
Historical data refined as more accurate information becomes available

Contributing Data

You can contribute to Robotaxi Tracker by:

Submitting vehicles you discover
Logging trips you take
Reporting spottings for existing vehicles
Suggesting color corrections for existing vehicles
Providing feedback on data accuracy

All contributions help improve the accuracy and completeness of the tracker. Thank you for being part of the community!

Questions or Feedback

If you have questions about our methodology or suggestions for improvement, please contact us at:

contact@ethanmckanna.com

Last updated: May 18, 2026