
Posts by Natural Stat Trick

Expect bot/scraper/excel/sheets/etc traffic not pointing at data.naturalstattrick.com to stop working sometime tomorrow.

2 days ago 3 0 0 0

Under a week to transition any scrapers, Google Sheets, etc, and probably before the season ends

(looking like it'll be somewhere between Tuesday and Thursday)

5 days ago 6 1 0 0

ALT: a white flag is waving in the wind against a grey background

UPDATE: I am giving this one more shot on social media.

Does anyone know a good WordPress web designer or someone who has experience in publishing on WordPress?

DM me here.

6 days ago 5 4 6 0

Reminder: if you run a scraper on NST, you've got a little over a week to switch over to this before the anti-bot/scraper traffic restrictions get dialed up on the main site

1 week ago 9 4 0 0

For the back to work on Easter Monday crowd

1 week ago 3 0 0 0

Hunh. You don't see the "x" skip a team too often

1 week ago 24 0 1 1

Good to hear

1 week ago 0 0 0 0

Did I post this while forgetting to actually open the signup form up?

Yes I did.

It's on the profile page now, though. I'll be happy if that's the only hitch with launching this.

1 week ago 6 0 0 0

While I did say it wasn't high security, it is higher than "sequential key values" level security

1 week ago 2 0 0 0

...right now.

(whoops)

1 week ago 3 1 1 0

(and just an obligatory reminder that it is Easter weekend - I'll try to approve key requests as soon as I can, but it's a busy weekend so you might be waiting a few minutes or the better part of a day if you request one)

1 week ago 3 0 1 0

And while there is some time to get switched over, if your scraper is one of the ones that's been stuck since the site moved behind Cloudflare, I suggest you get switched sooner rather than later. No more troubleshooting is going to happen on letting more traffic through there.

1 week ago 3 0 1 0

I've tried to make it as simple as possible - this is simple authentication, not heavy duty security.

Send it as a custom header or as part of the query string, and point everything at data.naturalstattrick.com instead of www.naturalstattrick.com

(just don't accidentally share it publicly)

1 week ago 4 1 2 0
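
A minimal sketch of what that switch could look like in a Python scraper, using only the standard library. The header name `X-Example-Key`, the query parameter name `key`, and the `/example.php` path are placeholders invented for illustration; the posts above do not give the real names, so check the site's own instructions for those. Only the two hostnames come from the thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

WWW_HOST = "www.naturalstattrick.com"
DATA_HOST = "data.naturalstattrick.com"

# ASSUMPTION: both names below are hypothetical placeholders.
KEY_HEADER = "X-Example-Key"
KEY_PARAM = "key"


def to_data_host(url):
    """Point a www.naturalstattrick.com URL at the data subdomain instead."""
    parts = urlsplit(url)
    if parts.netloc.lower() == WWW_HOST:
        parts = parts._replace(netloc=DATA_HOST)
    return urlunsplit(parts)


def with_key_in_query(url, key):
    """Option 1: append the key to the query string (parameter name assumed)."""
    parts = urlsplit(to_data_host(url))
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((KEY_PARAM, key))
    return urlunsplit(parts._replace(query=urlencode(query)))


def request_with_key_header(url, key):
    """Option 2: send the key as a custom header (header name assumed)."""
    return Request(to_data_host(url), headers={KEY_HEADER: key})
```

Either result can be passed to `urllib.request.urlopen`. Reading the key from an environment variable or a local config file, rather than hard-coding it, makes it harder to share publicly by accident.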

As mentioned at the start, this is intended to be the ONLY way to access the site with scrapers, bots, etc. That means the traffic restrictions are going to get dialed up everywhere else.

Not right away - I want to give people time to get set up - but soon. Plan on it being before the playoffs.

1 week ago 3 0 1 0

Use of the key is rate limited through tokens, matching 2 of the 4 old limits: 80 pages in 5 minutes and 180 in 1 hour stay; the 2 others are gone

Going over doesn't get you blocked anymore, you're just out until the refresh (but going over continually and excessively will get the key deactivated)

1 week ago 4 0 1 0
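
One way a scraper could stay under both limits is to track its own request times and pause before it would go over. This is a client-side sliding-window sketch only; it is not the site's token mechanism, whose refresh timing is not described here. The class name `PageBudget` and its methods are made up for illustration, and the two limits are the ones quoted above.

```python
import time
from collections import deque


class PageBudget:
    """Client-side guard for several (max_pages, window_seconds) limits."""

    def __init__(self, limits=((80, 300.0), (180, 3600.0)), clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next request fits inside every limit."""
        now = self.clock()
        longest = max(window for _, window in self.limits)
        # Forget requests that are older than the longest window.
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()
        delay = 0.0
        for max_pages, window in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_pages:
                # Wait until enough old requests leave this window.
                blocking = recent[len(recent) - max_pages]
                delay = max(delay, blocking + window - now)
        return delay

    def record(self):
        """Call once for every page actually requested."""
        self.sent.append(self.clock())
```

In a scraping loop, that would be `time.sleep(budget.wait_time())` before each fetch and `budget.record()` after it.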

The big change here isn't just pointing to a new subdomain; it's that a key is also required. Those keys are free, but they do require an account (also free) on the site and are limited to one per account

Keys need to be approved (primarily to weed out bad faith usage), so expect a short delay during setup

1 week ago 4 0 1 0

So, here is that fundamental change - a new subdomain has been set up SPECIFICALLY for data scraping, bots and any other forms of automated access.

(and intended to be the only way fairly soon, but more on that a bit later in the thread)

1 week ago 25 6 1 7

(at least for now)

3 weeks ago 1 0 0 0

I will say that if you're using a headless/unattended/background version of a standard browser and are still having problems, you're probably out of luck.

3 weeks ago 1 0 1 0

I can't guarantee I can open it back up for everyone (without letting the problem scrapers back in too) but I'll at least take a look

3 weeks ago 2 0 1 0

2) if you haven't (or reverted and it still doesn't work), DM me with this from your most recent try:

-the public ip you're scraping from
-what you're using to scrape (Excel, Google, Python, etc)
-full link to one of the pages that is failing

3 weeks ago 3 0 3 1

If you are still having issues after this point:

1) if you made changes to your scraping method since Friday afternoon, revert them and try again

3 weeks ago 0 0 1 0

Excel should also be fixed now

3 weeks ago 5 0 1 0

Google Docs issues should be fixed now

3 weeks ago 7 0 4 0

The IPs being used were from all over the world, unfortunately

4 weeks ago 0 0 0 0

Ultimately, this may require some fundamental changes to how scrapers, Google Docs, etc, can access the site, even the ones that keep to a volume that doesn't cause problems. We'll see.

4 weeks ago 2 0 3 1

A fix is in place for now. Unfortunately, it's one that will probably block most scrapers and not just the problematic ones. It will have to do for the moment, but I'll keep looking for a better solution

4 weeks ago 7 0 3 0

I think it's one bot in particular, rotating through IPs (or using a botnet) and user agents to try to avoid detection, and sending such a high volume of requests that the server can't handle it well

4 weeks ago 0 0 1 0