How we actually test these apps

Okay, deep breath, here’s how we actually test these apps.

We try to keep this simple and honest, because we get a little allergic to “comprehensive 47-factor proprietary scoring framework” energy. The truth is, the scoring matters less than the testing does. So:

1. We pay for our own subscriptions

Every single one. Out of our own pockets. We’ve never accepted a complimentary subscription, a press code, or any other form of free access. We do this because the moment you accept free access, you become a guest of the platform, and you can’t really write honestly about a place you’re being hosted by.

2. We test for at least 8 hours, per editor, per platform

That’s two editors × 8 hours × however many platforms we’re covering this cycle. Sometimes much more. We don’t write about an app we haven’t lived with for at least a week.

Editor A goes deep: one persona, long sessions, sustained conversation, week over week. We’re looking at whether the app holds together over time, whether the persona drifts, whether the writing keeps up.

Editor B goes wide, rotating through five different personas to test breadth: different ages, different vibes, different conversation styles. We’re looking at how the app handles range.

3. We pay attention to how it felt

Then we sit down, separately, and write down what we noticed. Not just bullet points of features. Actually what felt good, what was awkward, what made us close the app, what made us come back. The feeling matters because it’s what readers actually experience.

4. We talk about it

Then the two of us get on a call and talk through what we found. Sometimes we agree, sometimes we don’t, and the disagreements are usually the most interesting parts. We work through them out loud, then in writing, and the result is what becomes the review.

5. We sit on it for a couple days

Before we hit publish we leave the draft alone for at least 48 hours. Then we re-read it. If it still feels true, we publish. If something feels off, we go back.

What we score

We give each platform a single overall vibe-score out of 10. Not because nuance doesn’t matter (it really does, and we talk about it in the review), but because a single number is the most honest way to answer “is this thing worth your time?” We think any reviewer offering you 47-factor breakdowns is mostly trying to seem authoritative.

The score is a vibe. It reflects: did the app deliver on what it promised, did the writing hold up, did the visuals (if any) make sense, did pricing feel fair, was it easy to leave when we were done.

What we don’t do

Accept review copies, free trials beyond what’s publicly available, or any platform-provided access.
Pre-publication outreach. We don’t email platforms to tell them we’re reviewing them. We don’t accept their outreach either.
Affiliate-weighted scoring. The score is locked before any affiliate decision is made.
Re-test on request from platforms. If you’re a platform and you’ve changed something significant, you’ll be re-tested in our next cycle, like everyone else.

And that’s it. It’s deliberately a slow, boring methodology. It’s also the only one we could come up with that felt honest. 🫶