Tennis Ball Tracking on a Phone — Yes, Really

TrackNet-based ball trajectory tracking on phone-recorded tennis video. What it gets right, what it doesn't, and how it compares to in-stadium Hawk-Eye.

In plain English: ball tracking is the part of AceSense that finds the tennis ball in every frame of your video and reconstructs its trajectory through the air. Where it bounces, how fast it's moving, what curve it took to get there — that's all derived from ball tracking. Without it, the court heatmap is empty, the speed numbers don't exist, and shot classification's accuracy drops sharply because it can't use ball trajectory as a feature.

This page covers what ball tracking actually does, how good it is, and the angles where it falls over.

What it does, in one paragraph

AceSense uses a TrackNet-derived ball detector — TrackNet is the open-source approach the racket-sports computer-vision community converged on for tracking small fast objects through partial occlusion. The model takes consecutive video frames as input and produces a per-frame ball position. From that, the pipeline reconstructs the 3D trajectory using the court-keypoint scale (so we know how big the court is in pixels), estimates the bounce points (where the trajectory's z-coordinate crosses the court plane), and feeds the trajectory into shot classification, the heatmap, and the speed calculation. The full pipeline is documented at /how-it-works; this page is the ball-tracking layer specifically.
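To make the "court-keypoint scale" concrete, here is a minimal sketch of how pixel measurements become metres. Every number and name below is illustrative, and a single scale factor ignores perspective (a real pipeline would map pixels through a court homography), but the core idea is the same:

```python
import numpy as np

# Hypothetical detected keypoints: the two singles-baseline corners, in pixels.
SINGLES_BASELINE_M = 8.23                  # ITF singles court width, a known constant
left_px = np.array([212.0, 655.0])
right_px = np.array([1710.0, 648.0])

# Known real-world length divided by measured pixel length = metres per pixel.
m_per_px = SINGLES_BASELINE_M / float(np.linalg.norm(right_px - left_px))

# A ball displaced 300 px between consecutive frames at 30 fps:
ball_px_step = 300.0
speed_m_s = ball_px_step * m_per_px * 30.0   # roughly 49 m/s, i.e. a ~110 mph serve
```

At this scale a ball is only about 12 px wide, which is why a fast serve can cross many ball-widths per frame — the gap-filling described below exists for exactly that reason.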

How accurate it is

Honest answer:

  • Per-frame ball detection: in the high 90s on well-lit, behind-baseline phone footage at 1080p. The ball is a 5–15 pixel yellow object — TrackNet was literally designed for this.
  • Trajectory reconstruction: the model misses some frames during fast serves (the ball can travel more than its own width per frame), and it interpolates. Bounce-point accuracy is typically within 30–50 cm of the true bounce on the heatmap — good enough to tell you "your forehand is landing in the middle third" with confidence, not good enough to call lines.
  • Speed estimation: within 5–10% on serves and groundstrokes shot from the recommended camera angle. We don't quote sub-mph numbers and we're explicit when the speed is a low-confidence estimate (e.g. very short rally, heavy occlusion).

Full methodology, including how we measure these against hand-annotated ground truth, is on the /accuracy page. We publish the regression suite results — no other consumer tennis app does, and the serve-speed disbelief threads on Reddit are exactly the reason. If you've ever wondered whether your tennis app's "130 mph serve" reading was real, the answer is: maybe, but the methodology was never published. Ours is.

The deep methodology piece is at /blog/how-accurate-is-acesense. Read it before you trust any number.

Where it fails

Three real failure modes:

1. Heavy backlight or low-contrast lighting

If you film at sunset with the sun behind the player, the ball can blow out into the highlights or get lost in shadow. TrackNet's per-frame detection rate drops, the trajectory reconstruction interpolates more, and bounce-point estimation degrades. The fix is the same as for any phone video: don't shoot directly into the sun. Indoor floodlit courts and morning/midday sun are fine. Late-afternoon backlight is the killer.

2. Very low resolution or heavily compressed video

The ball is small. If your phone records at 480p, or if the video has been re-compressed by a messaging app before upload, the ball can drop below the resolution threshold the model needs. Upload from your phone's native gallery, not from WhatsApp or Telegram (which re-encode aggressively). 1080p at 30 fps is the comfortable floor; 4K is fine; below 720p, accuracy drops noticeably.

3. Net-cord skim and the moment of bounce

When the ball clips the net cord and changes direction sharply, the trajectory reconstructor sometimes misses the inflection, and the bounce-point estimate can land on the wrong side of the net in some cases. Similarly, the moment of bounce itself — when the ball hits the court — is occasionally estimated 1–2 frames late, which translates to a few centimetres of bounce-position error on the heatmap. This is a known limitation; we mark these shots as low-confidence in the report.

There are smaller failure modes (a second ball in frame from the next court over, a player wearing a yellow shirt, a recently-mowed pollen-yellow court surface) that can spike false detections. These are rare in real club tennis but they exist.

Why this is the right framing for an amateur player

Here's the thing about ball tracking for an amateur player: you don't need Hawk-Eye accuracy. You need honest accuracy.

A 30 cm error on a bounce point doesn't matter when the question is "do my forehands land in the middle third or the back third?" — those zones are 4 metres deep. A 7% error on a serve speed reading doesn't matter when the question is "is my second serve 20 mph slower than my first?" — that gap is 30 mph.

What matters is the trustworthiness of the direction the numbers move. If your serve speed reading goes up across four weeks of practice, that's a real change — both serves are subject to the same systematic error, so the delta is reliable even when the absolute number isn't. The same is true for bounce-zone consistency, rally-shot speed, and trajectory shape.
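A toy numeric sketch of why the delta survives a systematic error — all numbers here are invented for illustration:

```python
# Invented ground-truth serve speeds for week 1 and week 4 of practice.
true_w1, true_w4 = 88.0, 93.0        # mph
bias = 1.07                          # a hypothetical +7% systematic calibration error

meas_w1, meas_w4 = true_w1 * bias, true_w4 * bias
true_delta = true_w4 - true_w1       # 5.0 mph of real improvement
meas_delta = meas_w4 - meas_w1       # 5.35 mph: inflated by the same 7%, but the
                                     # direction and rough size are preserved
```

A multiplicative bias scales the delta by the same factor it scales each reading, so the sign of the trend is always right and its magnitude is only off by the bias itself, never by the full absolute error.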

This is the framing competing apps haven't gotten right. They market the giant "139 MPH SERVE" number, the player goes to Reddit and posts "Am I really serving 130mph?", and trust collapses. We market the deltas and the zones — the things you can actually use to improve — and we publish the methodology so you can audit us.

The downstream features that consume ball tracking — the court heatmap (where the bounces land) and shot detection (which uses the trajectory shape as a feature) — are calibrated to the same accuracy band.

Walkthrough: one serve, end-to-end

You hit a flat first serve down the T. Here's what ball tracking is doing:

  1. Frames -90 to -10 (relative to contact): TrackNet sees you and the ball in your hand during the toss. The ball is detected cleanly frame by frame.
  2. Frames -10 to -1: Toss apex, then contact. The vertical component of the ball's velocity flips sign at the apex. At contact, the ball accelerates sharply forward.
  3. Frames 0 to ~5: The ball is moving very fast (the ball moves more than its own width per frame at 110+ mph at 30 fps). Detection rate drops; trajectory is reconstructed by fitting a parabola to the surrounding frames.
  4. Frame ~6 (bounce): The ball hits the court. The z-coordinate of the trajectory crosses zero. This is the bounce point that goes on the heatmap.
  5. Speed calculation: Distance from contact point to bounce point, divided by elapsed time, scaled by the court-keypoint calibration. That's your serve speed. The error band is reported in the long-form report.
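The steps above can be sketched in a few lines. This is an illustrative toy, not the production reconstructor: the function names are invented, positions are assumed already converted to metres via the court keypoints, and a degree-2 polynomial stands in for the real trajectory model:

```python
import numpy as np

def interpolate_gap(frames, positions, missing_frame):
    """Step 3: fill a missed detection by fitting a parabola to the
    surrounding detected frames."""
    coeffs = np.polyfit(frames, positions, deg=2)
    return float(np.polyval(coeffs, missing_frame))

def bounce_frame(frames, heights_m):
    """Step 4: the bounce is the first frame where the reconstructed
    height crosses the court plane (z <= 0)."""
    for f, z in zip(frames, heights_m):
        if z <= 0.0:
            return f
    return None

def serve_speed_mph(contact_xy, bounce_xy, contact_frame, b_frame, fps=30.0):
    """Step 5: straight-line distance over elapsed time, converted to mph.
    Positions are in metres, already scaled by the court-keypoint calibration."""
    dist_m = float(np.hypot(bounce_xy[0] - contact_xy[0],
                            bounce_xy[1] - contact_xy[1]))
    return dist_m / ((b_frame - contact_frame) / fps) * 2.23694
```

For example, `interpolate_gap([0, 1, 3, 4], [0.0, 1.0, 9.0, 16.0], 2)` recovers the point on the parabola that the missed frame 2 should have produced, because the surrounding detections pin down the curve.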

The whole thing runs once per shot and feeds the report. You see "first serve, 92 mph, T-zone, deep" — and that's a synthesis of every layer of the pipeline working together.

What it doesn't do

Important to be clear:

  • Not line-calling. We don't tell you whether your serve was in or out. Hawk-Eye does that with multi-camera triangulation and we don't pretend to match it. Don't argue line calls with your hitting partner using AceSense.
  • Not spin estimation from a phone camera, today. Spin requires either much higher frame rate than a phone records or a multi-camera setup. We mention spin in our marketing only as a feature on the roadmap, not a current capability — we don't want to be the app that ships an unreliable number and then deals with the trust fallout.
  • Not your fault when the ball isn't visible. If you stand in front of the camera between bounces, ball tracking can't see what it can't see.

Pricing

Ball tracking is on every tier, including free. The accuracy is the same on every tier — there's no "premium ball tracking" and there won't be. The free tier limits how many matches you can analyse per month, not which features you get. Full breakdown at /pricing.


Ready to see it on your own video? Upload a match free and look at the trajectory overlays in the report. Or read the methodology page first if you want to know the error bands before you trust the speed numbers. Ball tracking feeds the court heatmap and shot detection; the three together are the foundation of the report.

Frequently asked questions

Is this Hawk-Eye?
No. Hawk-Eye uses 6–10 calibrated stadium cameras and gives millimetre-level line-call accuracy. AceSense uses one phone camera and gives club-level useful trajectory data. Different tool, different price, different use case. We're not trying to call lines at Wimbledon — we're trying to give NTRP 3.0–4.5 players a heatmap they can act on.
How fast can the model track?
TrackNet processes at native video frame rate during analysis. Most phones record at 30 or 60 fps; the ball is a 5–15 pixel object moving across the frame, which TrackNet was specifically designed for. Modern phone cameras at 1080p are well above the resolution threshold the model needs.
Does it estimate ball speed?
Yes — but with caveats. Ball speed is computed from the trajectory and the court-keypoint scale. It's reliable to within roughly 5–10% on serves and groundstrokes filmed from behind the baseline at recommended height. We don't claim sub-mph accuracy and we don't market a giant 'YOUR FASTEST SERVE' number — that's the framing competitors use that produces the [Reddit complaints](https://www.reddit.com/r/10s/comments/xc2xc0/) about implausible speed numbers. See /accuracy for our full speed methodology.
Does it work on clay courts?
Yes, with one caveat: heavily-worn clay courts where the lines are partially erased can degrade the court-detection step that ball tracking depends on for spatial calibration. The ball itself is detectable on any colour court — yellow ball on red clay is high contrast for the model. The bottleneck is the court keypoints, not the ball.
Why TrackNet specifically?
TrackNet is the open-source approach the field has converged on for small-fast-object detection in racket sports. It's purpose-built for what we need — tracking a tiny, fast-moving ball through partial occlusion. Building a from-scratch ball tracker would have been worse and more expensive. We adapted the approach with our own training data and made it work on phone-quality video; the heritage is TrackNet, the implementation is ours.