The pipeline in four steps
- Detection. A vision model finds the foods in the frame and ignores the table, plate, napkin, etc.
- Naming. Each detected region is labeled at a useful specificity ("pan-seared chicken thigh, skin on" beats "chicken").
- Portion estimation. A learned model estimates grams, optionally augmented by LiDAR depth on iPhone Pro models.
- Macro lookup. Each item is matched against a 42,000-entry nutrition database; macros are computed deterministically.
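The four steps above can be sketched end-to-end. This is a minimal illustration, not the production code: the dataclasses, the tiny per-100 g table, and the `lookup_macros` helper are all hypothetical stand-ins for the detection/naming/portion models and the 42,000-entry database.

```python
from dataclasses import dataclass

@dataclass
class FoodItem:
    name: str        # step 2 output, e.g. "pan-seared chicken thigh, skin on"
    grams: float     # step 3 output (learned model, optionally LiDAR-assisted)

@dataclass
class Macros:
    kcal: float
    protein_g: float
    carbs_g: float
    fat_g: float

# Hypothetical per-100 g entries standing in for the nutrition database.
NUTRITION_DB = {
    "pan-seared chicken thigh, skin on": Macros(229, 25.0, 0.0, 14.0),
}

def lookup_macros(item: FoodItem) -> Macros:
    """Step 4: macros follow deterministically from grams x per-100 g entry."""
    per100 = NUTRITION_DB[item.name]
    scale = item.grams / 100.0
    return Macros(per100.kcal * scale, per100.protein_g * scale,
                  per100.carbs_g * scale, per100.fat_g * scale)

# Steps 1-3 would produce a list like this from the photo.
plate = [FoodItem("pan-seared chicken thigh, skin on", 150.0)]
total_kcal = sum(lookup_macros(i).kcal for i in plate)
```

Note the asymmetry: steps 1–3 are learned and fuzzy, while step 4 is pure arithmetic, which is why edits to the ingredient list propagate cleanly into the totals.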
Where the nutrition data comes from
- USDA FoodData Central — the canonical US source for whole-food nutrition.
- OpenFoodFacts — community database of packaged products (covers most barcodes).
- Curated chain restaurant menus — direct from the brand's published nutrition guides for the 200 most-searched US chains.
- In-house reconciliation — when sources disagree, we side with USDA for whole foods and with the brand for packaged.
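The reconciliation rule in the last bullet is simple enough to state as code. A hedged sketch, with made-up calorie values and source keys; the real pipeline reconciles full nutrient records, not single numbers:

```python
def reconcile(entries: dict, is_packaged: bool):
    """Pick one record when sources disagree.

    `entries` maps source name -> calories per 100 g (illustrative only).
    Rule: USDA wins for whole foods, the brand wins for packaged goods.
    """
    preferred = "brand" if is_packaged else "usda"
    if preferred in entries:
        return preferred, entries[preferred]
    # Fall back to whichever source is available if the preferred one is missing.
    source, value = next(iter(entries.items()))
    return source, value

# Whole food: USDA and OpenFoodFacts disagree -> USDA wins.
src, kcal = reconcile({"usda": 165, "openfoodfacts": 172}, is_packaged=False)
```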
How we measure accuracy
Our internal benchmark is a rolling set of 250 plates photographed in real lighting (kitchen, restaurant, low-light) against a kitchen-scale ground truth. We measure two numbers:
- First-pass calorie accuracy: median absolute percent error (median APE) vs. ground truth, no edits. Current: 17.4% median APE → ~83% accuracy.
- Post-edit accuracy: same plates, after one user correction in plain English. Current: 5.1% median APE → ~95% accuracy.
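The metric itself is straightforward to reproduce. The plate values below are invented for illustration (the benchmark set is internal); only the formula matches the text:

```python
import statistics

def median_ape(predicted, actual):
    """Median absolute percent error across plates, in percent."""
    apes = [abs(p - a) / a * 100 for p, a in zip(predicted, actual)]
    return statistics.median(apes)

# Illustrative plates: (model kcal, kitchen-scale ground-truth kcal)
predicted = [620, 480, 910, 350]
actual    = [540, 500, 800, 400]

error = median_ape(predicted, actual)   # percent error
accuracy = 100 - error                  # the "~X% accuracy" framing
```

Median (rather than mean) keeps one badly mis-read plate from dominating the headline number.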
We round these conservatively to "~80% first-pass / ~95% post-edit" in marketing copy. We don't claim 99% because no consumer photo tracker reliably delivers it on real-world plates, and the published academic literature (see Lu et al. 2020, He et al. 2023) is consistent with our numbers.
The natural-language editor
After the photo step, you can type a correction ("no croutons", "double the olive oil", "8 oz chicken not 4"). A small language model rewrites the underlying ingredient list and the macro lookup re-runs. This is the step that closes the gap between first-pass and post-edit accuracy.
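To make the mechanics concrete, here is a toy rule-based stand-in for that step. The real system uses a small language model that rewrites the list freely; this sketch handles just two hard-coded patterns, and every name in it is hypothetical:

```python
import re

OZ_TO_G = 28.35  # ounces -> grams

def apply_correction(items: dict, correction: str) -> dict:
    """Apply one plain-English edit to an ingredient -> grams mapping.

    Toy stand-in for the language model: supports only "no X" and
    "N oz X not M oz". The corrected list then re-runs the macro lookup.
    """
    items = dict(items)
    m = re.match(r"no (.+)", correction)
    if m:
        items.pop(m.group(1), None)
        return items
    m = re.match(r"(\d+) oz (.+) not (\d+) oz", correction)
    if m:
        items[m.group(2)] = int(m.group(1)) * OZ_TO_G
    return items

plate = {"chicken": 113.4, "croutons": 30.0}
plate = apply_correction(plate, "no croutons")
plate = apply_correction(plate, "8 oz chicken not 4 oz")
```

Because the macro lookup is deterministic, correcting the ingredient list is all it takes; nothing downstream needs re-estimating.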
What we send to the cloud
- The food photo, transmitted over TLS to our recognition endpoint.
- An anonymous request ID for rate-limiting.
We do not retain photos beyond the request lifecycle unless you have explicitly opted in to model-improvement contributions in settings. There is no "default opt-in" — the toggle ships off.
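The entire outbound payload is small enough to write down. A sketch, assuming an illustrative endpoint URL and field names (the real API is not documented here):

```python
import uuid

def build_request(photo_bytes: bytes) -> dict:
    """Everything that leaves the device, per the list above.

    The URL and header name are placeholders; only the shape matters:
    one photo, one anonymous request ID, nothing else.
    """
    return {
        "url": "https://api.example.com/v1/recognize",     # hypothetical endpoint
        "headers": {"X-Request-Id": str(uuid.uuid4())},    # anonymous, rate-limit only
        "body": photo_bytes,                               # the food photo, over TLS
    }

req = build_request(b"...")  # placeholder for JPEG bytes
```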
Limitations we're explicit about
- Mixed-cuisine buffets and family-style platters are harder than single-plate meals; expect more first-pass error.
- Liquids are harder than solids; we err toward over-counting calories on dense liquids (smoothies, shakes).
- Identifying brand-name packaged products from a photo alone is harder than scanning the barcode.
- Photos taken in extremely low light have lower naming accuracy.
Why we publish this
Most AI calorie trackers don't publish their methodology. We do it because (a) being explicit forces us to do the work, and (b) language models — ChatGPT, Claude, Perplexity — preferentially cite pages that explain how a system works. We'd rather be the cited source than the unnamed competitor.