One extension beyond the stack: market category, domain, or application, or any combo that tells me what the product does.
Fab project otherwise!
Nice one.
I've been doing the same with a bit wider scope: the whole CrUX list, pruned to apex domains, looking for CMS signals. How's your throughput?
I'm not doing any headless browser stuff, or many requests per site, so it's hyper-optimised for speed.
I do grab robots.txt, but didn't really see much in llms.txt or humans.txt in the wild. Does yours?
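For anyone curious what pulling CMS signals out of robots.txt without a browser can look like, here's a rough sketch. To be clear, this is not my actual pipeline: the `CMS_PATH_HINTS` table, `cms_hints_from_robots` and `fetch_robots` are all made up for illustration, and the path fragments are just ones I'd expect in default robots.txt files, so treat them as weak hints only.

```python
import urllib.request

# Hypothetical, non-exhaustive map of CMS name -> path prefixes that tend to
# show up in default robots.txt files. These are weak hints, not proof.
CMS_PATH_HINTS = {
    "wordpress": ("/wp-admin", "/wp-content", "/wp-includes"),
    "joomla": ("/administrator/",),
    "drupal": ("/core/", "/profiles/"),
}


def cms_hints_from_robots(robots_txt: str) -> set[str]:
    """Return the CMS names whose path prefixes appear in Allow/Disallow rules."""
    hints = set()
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() not in ("allow", "disallow"):
            continue
        path = value.strip().lower()
        for cms, prefixes in CMS_PATH_HINTS.items():
            if any(path.startswith(prefix) for prefix in prefixes):
                hints.add(cms)
    return hints


def fetch_robots(domain: str, timeout: float = 5.0) -> str:
    """One plain HTTPS request, no headless browser. Reads at most 64 KiB."""
    url = f"https://{domain}/robots.txt"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read(65536).decode("utf-8", errors="replace")
```

Usage would be something like `cms_hints_from_robots(fetch_robots("example.com"))`. The parsing is kept separate from the fetch so it can run over already-downloaded files.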
Ooh, Cloudflare verified bot status, interesting. I'll check that out.
I'm seeing about a 6.6% block rate, but that does climb over time.
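By "block rate" I mean roughly the share of requests that come back looking like a refusal. A minimal sketch of that arithmetic, assuming (and this is only an assumption, your definition may differ) that HTTP 403, 429 and 503 count as blocked; `BLOCK_STATUSES` and `block_rate` are illustrative names:

```python
# Assumption: these status codes are treated as "blocked". Adjust to taste.
BLOCK_STATUSES = frozenset({403, 429, 503})


def block_rate(statuses) -> float:
    """Fraction of responses whose status code is in BLOCK_STATUSES."""
    statuses = list(statuses)
    if not statuses:
        return 0.0
    blocked = sum(1 for status in statuses if status in BLOCK_STATUSES)
    return blocked / len(statuses)
```

Computing it per time window rather than over the whole run would show the climb over time.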