How we test and grade
We’re not here to be another “best tools” blog. We publish failure-first receipts so you can buy the cheapest tool that won’t embarrass you. This page explains how we run tests and what our grades mean.
What we test (the failure modes)
The expensive failures are predictable. Tools don’t “kind of” fail — they fail confidently where it hurts: names, numbers, overlap, and noise.
Inputs: short clips + transparent “gold answers”
For v1.1, we use short, repeatable clips (typically 60–120 seconds) and publish the “gold answers” we score against. This keeps the tests reproducible and lets vendors challenge results.
- Clips are embedded and attributed (title, creator, source link, license).
- Gold answers are shown (no mystery scoring).
- We prefer repeatable scenarios over “impressions.”
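To make “no mystery scoring” concrete, here is a minimal sketch of how a clip can be scored against its published gold answer. It assumes the gold answer is a plain list of required names and numbers; the function and field names are illustrative, not our production harness.

```python
# Minimal sketch: score a tool's output for a clip against its published gold answer.
# Assumes gold answers are plain lists of required names and numbers; the function
# and field names are illustrative, not the production harness.
import re

def score_clip(tool_output: str, gold_names: list[str], gold_numbers: list[str]) -> dict:
    """Count hits and misses for two of the failure modes we test: names and numbers."""
    text = tool_output.lower()
    found_numbers = set(re.findall(r"\d[\d,.]*", tool_output))

    name_misses = [n for n in gold_names if n.lower() not in text]
    number_misses = [n for n in gold_numbers if n not in found_numbers]

    return {
        "names_correct": len(gold_names) - len(name_misses),
        "numbers_correct": len(gold_numbers) - len(number_misses),
        "misses": name_misses + number_misses,  # these feed the failure-first writeup
    }

# Hypothetical 90-second clip with two names and one figure in its gold answer.
print(score_clip(
    tool_output="Maria Chen said revenue hit 4,200 units last quarter.",
    gold_names=["Maria Chen", "Priya Patel"],
    gold_numbers=["4,200"],
)["misses"])  # ['Priya Patel']
```

Because every gold answer is published alongside the clip, anyone can rerun a check like this and challenge the result.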
What a “grade” means
Grades are a summary signal, not a promise. A strong grade can still be the wrong pick if the tool violates your constraints (bot policy, privacy posture, platform, integrations).
- Pass/fail stamps flag known failure modes.
- Lab summaries tell you where to be cautious.
- The pricing table routes you to the cheapest option that fits (sketched below).
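The routing itself is simple enough to sketch. The snippet below is an illustration under assumptions (hypothetical tool names, made-up constraint and stamp labels), not our actual schema: drop anything that violates a constraint or fails a required stamp, then take the lowest starting price.

```python
# Minimal sketch of "cheapest option that fits". Tool names, constraint labels,
# and stamp labels below are hypothetical, not our real data.
from dataclasses import dataclass, field

@dataclass
class Tool:
    name: str
    starting_price: float          # USD per month, as verified on the vendor page
    failed_stamps: set[str] = field(default_factory=set)    # failure modes it flunked
    constraints_met: set[str] = field(default_factory=set)  # e.g. "no-bot", "eu-hosting"

def cheapest_that_fits(tools: list[Tool], required: set[str], must_pass: set[str]) -> Tool | None:
    """Drop tools that violate a constraint or fail a required stamp, then sort by price."""
    eligible = [
        t for t in tools
        if required <= t.constraints_met and not (must_pass & t.failed_stamps)
    ]
    return min(eligible, key=lambda t: t.starting_price, default=None)

pick = cheapest_that_fits(
    tools=[
        Tool("Acme Notes", 8.0, failed_stamps={"numbers"}, constraints_met={"no-bot"}),
        Tool("Budget Scribe", 12.0, failed_stamps=set(), constraints_met={"no-bot"}),
    ],
    required={"no-bot"},
    must_pass={"numbers", "names"},
)
print(pick.name if pick else "nothing fits")  # Budget Scribe: cheaper Acme fails "numbers"
```

The point of the example: the cheapest row on the table is not the pick if it fails a stamp you care about.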
Pricing verification cadence
Pricing changes constantly. We aim for a weekly verification cadence and publish a “report an update” link on the pricing table.
- We verify the “starting price” and whether a free plan exists.
- We prioritize buyer-impacting changes (tier reshuffles, limits, retention changes).
- We link directly to vendor pages where possible.
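For illustration only, here is one way a verified price can be recorded and flagged once it drifts past the weekly cadence. The record layout, tool name, and URL are assumptions, not our internal tracker.

```python
# Minimal sketch of a pricing verification record with a weekly staleness check.
# The 7-day window mirrors the cadence above; the record layout, tool name, and
# URL are illustrative assumptions.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class PriceCheck:
    tool: str
    starting_price: str      # as shown on the vendor page, e.g. "$12/mo"
    has_free_plan: bool
    vendor_url: str          # direct link to the vendor's pricing page
    last_verified: date

    def is_stale(self, today: date, cadence_days: int = 7) -> bool:
        """Flag records older than the weekly verification cadence."""
        return today - self.last_verified > timedelta(days=cadence_days)

check = PriceCheck(
    tool="Example Tool",
    starting_price="$12/mo",
    has_free_plan=True,
    vendor_url="https://example.com/pricing",
    last_verified=date(2024, 5, 1),
)
print(check.is_stale(today=date(2024, 5, 10)))  # True: due for re-verification
```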
Corrections and vendor feedback
Tools change. If you believe something is materially inaccurate, we want the receipts. Send a correction request with the URL, what’s wrong, and supporting evidence.