

3 days ago


My best guess is that they don’t just index things, but rather download straight from the internet when they need fresh training data. They can’t really cache the whole internet after all…
They cause a huge amount of load, deteriorating the service for everyone else. I’m also guessing the time ranges in the graph where there’s no data are when OP’s server crashed from the load and had to restart.
That kind of shit can easily trigger alerting and will look like a DDoS attack. I would be pissed, too, if I dropped everything to see why my server is going down and it’s not even proper criminals, but rather just some Silicon Valley cunts.