
Techniques to prevent apps hijacking my search engine?

@Welton855

Posted in: #Django #ScraperSites #SiteSearch

I'm working on a site with a large music content database, and recently some app developer looking for a bit of a reputation boost has launched an app that completely piggybacks on our database: a user will search for a track in the app, the app will send the request to our search page and scrape the results, returning the top result to the user. Here are some things I've tried:


Returning 444 for the app's user agent (but they changed it to a legit browser string).
Detecting a referrer on the search page, and returning 404 if no referrer is found (but it would be easy enough for them to spoof a referrer).


One idea I thought of is some sort of token sent along with the search query from any page with the search box; the search results page validates this token and, if it is invalid, returns 404. Are there any tried-and-true techniques that do this kind of thing? Or anything else I can do to stop this data thief? He's completely unapologetically stealing our data and crippling our site!

By the way, I'm using Django, in case there's anything built in that could help me.


@BetL925

Your token idea would work. You could change the token periodically to make it harder to reuse.
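A minimal sketch of such a token, assuming it is an HMAC-signed timestamp that the page with the search box embeds in a hidden form field and the search view checks before running the query. The names SECRET_KEY, TOKEN_MAX_AGE, make_search_token and is_valid_search_token are made up for illustration, and this uses only the Python standard library rather than anything Django-specific:

```python
import hashlib
import hmac
import time

# Hypothetical values: in a real project the key would come from settings.
SECRET_KEY = b"replace-with-a-real-secret"
TOKEN_MAX_AGE = 600  # seconds a token stays valid


def _sign(timestamp_text):
    """Return the hex HMAC-SHA256 signature of the timestamp text."""
    return hmac.new(SECRET_KEY, timestamp_text.encode(), hashlib.sha256).hexdigest()


def make_search_token(now=None):
    """Build a 'timestamp:signature' token to embed in the search form."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}:{_sign(ts)}"


def is_valid_search_token(token, now=None):
    """Accept only tokens we signed that are at most TOKEN_MAX_AGE seconds old."""
    now = time.time() if now is None else now
    try:
        ts, sig = token.split(":", 1)
        issued = int(ts)
    except (AttributeError, ValueError):
        return False  # missing or malformed token
    # Constant-time comparison of the supplied and expected signatures.
    if not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return False
    return 0 <= now - issued <= TOKEN_MAX_AGE
```

The search view would return 404 whenever is_valid_search_token() is false. Django also ships django.core.signing.TimestampSigner, which covers the same sign-and-expire idea. Keep in mind that a scraper can still load the page with the search box first and read the token from it, so this raises the effort required rather than making scraping impossible.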

You could also change the parameters on your search form. Start using s= instead of q= for the search term. That would force this developer to keep up with those changes.
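If changing the parameter by hand gets tedious, the rotation could be automated. This is only a sketch under my own assumptions: the field name is derived from a secret plus the current date, and yesterday's name is still accepted so a form rendered just before midnight keeps working. SECRET_KEY, search_param_name and extract_query are hypothetical names, not anything built into Django:

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical secret: in a real project this would come from settings.
SECRET_KEY = b"replace-with-a-real-secret"


def search_param_name(day):
    """Derive the search field name for a given date, e.g. 's' + 6 hex chars."""
    digest = hmac.new(SECRET_KEY, day.isoformat().encode(), hashlib.sha256).hexdigest()
    return "s" + digest[:6]


def extract_query(params, today):
    """Return the search term sent under today's or yesterday's field name, else None."""
    for day in (today, today - timedelta(days=1)):
        value = params.get(search_param_name(day))
        if value is not None:
            return value
    return None
```

The template would render the search input with name=search_param_name(date.today()), and the view would call extract_query(request.GET, date.today()) and return 404 when it gets None. As with the token, the scraper can read the current name from the form, so it only forces them to fetch and parse your page before every search.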

You could also implement a CAPTCHA and not show search results unless the user proves they are human. Google just launched a version of reCAPTCHA that is just a checkbox: googleonlinesecurity.blogspot.hu/2014/12/are-you-robot-introducing-no-captcha.html
