On Wednesday someone tried to scrape Snooth. Now, this happens all the time - Google, for example, crawls our site daily, downloading hundreds of thousands of pages. Hundreds of other search engines do it too. However, this traffic was coming from a source we didn't recognize, and it was coming in way faster than anything we'd experienced to date.
After some digging it turned out to be an anonymizer service that was using over 1,000 different servers to pound our site - this slowed down our page load times for legitimate users. We quickly found an innovative way to block them, and our site speed returned to normal.
Snooth is the world's largest wine database - the site has the most wines, reviews, images, winemakers' notes and wine-related content in the world, so I understand why other sites would want access to the content. However, we offer an API that makes getting that content easier than crawling or scraping us.
The API is a better way to get at our data - it's a dedicated set of servers, so you're not fighting over resources with our users. It'll also stay current. Instead of scraping the data and seeing it go out of date in a month, you can use the API to collect images, reviews, notes and wine information. Best of all are the prices - by using the API you can keep the inventory items and prices up to date.
We launched the API in November, and so far over 50 companies have launched or are building applications that run off the Snooth API. In addition to our 500,000-strong monthly audience, the extended network of users that we reach via these API partnerships brings the total audience to over 2,000,000 users per month.
So, next time you are thinking you want some wine data, please go take a look at our API instead of scraping us. It'll give you better data, and we won't be trying to throttle your access to protect our users.
Open letter to web hackers
- Blog comment by Hello Vino, Feb 27, 2009.
HelloVino.com uses the Snooth API to deliver wine pairing suggestions, and also allows users to submit their own notes and ratings. I agree with Philip: using the Snooth API is far more efficient (and much cooler) than scraping Snooth.com.
On a similar note, Philip - I sent a message to email@example.com regarding some API XML results. What's the typical turnaround time on responses to those inquiries sent to that address?
Thanks again for offering up Snooth's amazing amount of data.
Rick from Hello Vino
- Reply by Mark Angelillo, Mar 2, 2009.
Rick -- We'll be back to you re: your inquiry shortly. Glad you're finding the API to be helpful!