
Tag: WebCrawlers


YK1175434

: Web crawling and ethics/legality Is it illegal or unethical if I compare prices on my website and don't provide a link to that website but instead go there myself and deliver it to the customer?

Alves908

: Facebook crawler with no user agent spamming our site in possible DoS attack Crawlers registered to Facebook (IPv6 ending in :face:b00c::1) were slamming our site, seeing 10s of thousands of hits

Chiappetta492

: Is there a way to get Google Search Console crawl stats for longer than 90 days? Google Search Console allows you to see how many pages Google is crawling per day on your site, but it only

Hamaas447

: What is `/&wd=test` URL that is being requested from my site, probably by bots I'm seeing error logs on a website because something tried to access: example.com/&wd=test the HTTP_REFERER

Steve110

: How to force Google and other bots to pick actual images and not thumbnails? For example, if there is an online shopping websites with thousands of small thumbnails of products and when you

Kaufman445

: Pagination and crawl depths On a website with ~40 blogs, we have recently switched on pagination, meaning blogs are on pages 1-8. With Google being less likely to crawl over 3 clicks deep,

Angie530

: Do soft 404 errors on wiki sites caused by pages not yet created cause SEO problems? I host a couple of wiki based sites, so there is a lot of content at various stages of generation, and

BetL925

: 404 or 302 Redirect - what to use for a url which may be used in the future but not available at the moment My site lists blogs like this example.com/?status=blog&id=number I only have

Kevin317

: Will a dynamic robots.txt file that disallows crawling based on the time of day hurt SEO? We have a serious traffic issue on our site and we want to eliminate crawlers as part of the problem.

Candy875

: Strange usage pattern: UK user accessing 1 page X times every 14hrs – who/what can it be? For about two weeks I have seen a strange pattern in my web statistics: approximately every 14 hours a

Reiling115

: Website home page URL not showing in Google Search If I search for site:www.paka.tv, my main homepage URL is not showing, but the category pages of my site are showing in the results. How to show my website

Cofer257

: Why am I getting bot hits from compute-1.amazonaws.com? I have a WordPress website with AWS in use, namely the CloudFront service, to serve CSS, images and JS from the cloud. Lately, I noticed

Kevin317

: Webmaster tools 'URL errors' always shows errors The issue I am facing is that my webmaster tools always show an error message. The error URLs are not related to my website. The site URL mentioned

Yeniel560

: Will creating an app that users prefer to the website reduce search engine rankings due to lower usage of the site? If all users use only the iPhone or Android app instead of using the website,

Chiappetta492

: How to find out the referrer of Googlebot's crawling URL? Googlebot crawls 100s of 404 URLs from my website. I want to know where it gets those links from. Is there anything like HTTP Referrer?

Carla537

: Robots.txt isn't preventing my site from being crawled I'm having a problem with robots.txt. I put the robots.txt file in the website main directory (and also in /var/www/html - to make it work

Voss4911412

: Keep order status private from search engines? I searched hard before posting this question. I apologize if it is a duplicate or if this is not the correct forum. We have a homegrown shopping

Angela700

: Home page has a YouTube video on it, unsure how to fix Google's description A website I'm working on has a YouTube video embedded on its homepage. For some reason, Google is ignoring my meta

Martha676

: Can we have Google crawl but ignore our paginated category pages and prefer our individual post pages in the index? We have a site with a massive content back-end (50,000+) with, let's say,

Welton855

: How frequently should I expect Google AdsBot to visit my site? I bought paid advertising for my website through an internet advertising company. I noticed that, among all the visits coming from

Nimeshi995

: Htaccess empty referer deny "google bot" I put this rule into my .htaccess file to deny empty referers, which returns 403: SetEnvIfNoCase Referer "^$" bad_user Deny from env=bad_user I

Jennifer507

: Confused with Google URL indexing When I enter my site in Google like this, site:mywebsite.com, I see the image below. I have 169 total URLs in my sitemap.xml file. It says 102 results

Hamaas447

: SEO for hidden anchor tags I have a very simple question: do Google or other search engines recursively crawl hidden hyperlink tags? I googled but could not find any solution. Any experience or any

Angela700

: How to stop indexing/crawling for Shop Checkout Summaries? I have a small shop checkout that uses cookies for my cart and after the payment is done it generates a unique order-id creates an

Carla537

: Unverified and verified Bing crawlers from same subnet I have some traffic from Bing crawlers I'm trying to verify. I'm using the method Bing suggests, namely reverse and then forward DNS lookup,

Sent6035632

: What's the SEO effect of copying the same content to another blog? I am writing on my own blog, now I would like to copy the same content to another blog without deleting the old one. Could

Hamm4606531

: How to prevent CDN content URLs being indexed by Google Well, robots.txt prevents crawling, and the meta robots tag in HTML (or the X-Robots-Tag HTTP header) prevents indexing (and other functionalities

Steve110

: Does Google or Bing follow dynamic HATEOAS links that are not in an 'a' tag? I have an Angular 1.x SPA that uses the HATEOAS standard to manage navigation. This is further complicated by

Holmes151

: What are these many requests referred from Facebook with a changing s= parameter? I just noticed a peak in our access logs and I'm curious how to explain this or if my suggestion how to

Shelton105

: How to prevent Googlebot from doing API requests? I have a currency converter site with around 32k pages, one page for each currency pair. And every page has 2 API requests. I started to see huge number

Phylliss660

: SEO impact of ajax loaded content from external site I am in need of some expert opinions before I put a lot of time into a business decision. I have a website where I write product reviews

Shelton105

: How do you add a rule just for a specific bot to robots.txt? I have a small website, for which the current robots.txt looks like so: User-agent: * Disallow: Sitemap: https://www.myawesomesite.com/sitemap.xml

Angela700

: How to block everything from being indexed except sitemap.xml I want to block everything and index sitemap.xml file alone. So I do it as shown below: User-agent: * Disallow: / Allow: /sitemap.xml

Mendez628

: WordPress website not indexing in Google, no error in Google webmaster tools I just noticed my SEO score is 0 and that my WordPress website is not indexed on the Google search engine, no automatic

Alves908

: When does a search engine learn about a particular user inside a website and display the profile link and/or a Login link along with search results? How does the Google search engine display user

Ravi8258870

: Is the User-Agent "gce-spider" a well known scammer, a bad bot? My website has been scammed using some "scamming-web-site steals my content through a proxy and serves the stolen content from

Radia820

: Prevent Googlebot from crawling "access denied" errors (403) of a private forum that are reported in Google Search Console? I'm running a website based on vBulletin. Recently I moved one of my

Annie201

: Moving a site from one subdomain to another subdomain, what to do with pages which aren't "mappable"? In the process of moving a site from one subdomain to another subdomain. For a lot of

BetL925

: How can I stop Google from indexing "pretty links" external redirects from my WordPress site? When I search site:[example].com in Google for my blog, the majority of the pages that are being

Annie201

: Can a user or a crawler see the source of a page that has been redirected via a 301? Is it possible for a user or a web crawler to see the contents/source code of a web document that is

Reiling115

: Preventing/blocking crawling of a specific user control of a page Currently, Google accesses/crawls a user control from the page given below: http://articles.mercola.com/sites/articles/archive/2017/07/20/do-fidget-spinners-help-anxiety

Karen161

: Repeated hits on my site from different IP addresses trying to access .aspx files using all my bandwidth I checked my raw access files after being notified that my site has been limited over

Rivera981

: In a robots.txt file, is a Noindex: command recognised? I have come across a website and in the robots.txt file it has the following information: User-agent: * Noindex: /search Disallow: /search Sitemap:

Ann8826881

: What % can be regarded as "normal" "not viewed traffic" reported in AWStats? Recently the statistics for my site show about 50% "not viewed traffic". This seems like a lot, as many robots are blocked. What

Reiling115

: How can I avoid the duplicate title tag error on paginated site search pages? Google Search Console shows me a duplicate title tag for my site search page,

Shanna517

: Googlebot submitting thousands of requests to our map locater and using up the API quota We have a store locater page on our customer's site. The end user enters their postcode and a search

Nimeshi995

: ASP.NET SEO and web crawlers So I am going to build a website in ASP.NET and have a few questions. I am planning on using technologies like React and EF. My worry is SEO though, I am wondering

Gail5422790

: Google is adding "Archive" to the title of tag and category pages and not using the meta description I am wondering why Google is indexing my tags and categories as archives. Is that OK to

Martha676

: Should I prevent search engines from indexing empty user profile pages? If so, how much content is enough for indexing? I'm developing a social website for book readers, with public user profile

Cooney921

: Would an online queueing system for a website have an effect on Google's ability to crawl the site? If I were to implement a visitor queuing system for my website, where if visitor count is

Michele947

: How long before Google indexes a new (to me) domain that previously had spam and virus problems? I registered a new ".com" domain, but when I added it to the Google Search Console I saw that

Sims2060225

: Is it bad for SEO to have a URL with nothing on it? I have a WordPress site and for a very complicated reason I have to set some post type's template to an empty file. Nothing. Which means

Hamaas447

: How to crawl a website that requires cookies for an audit? Situation: My client's website requires cookies to access it. Users should choose (Language and country) to access the website. The problem is:

Moriarity557

: Can we find all backlinks to a webpage? In Page and Brin's paper on PageRank, they say that while you are guaranteed to be able to find all links that point away from a page, the reverse

Pope3001725

: If .htaccess is used to block my bot from accessing a particular directory, will I know this? I'm working on a research project and I have a question. Say I would like to crawl all pages

Caterina187

: Why would a bot submit a sign up form with fake info? I have a sign up form where people can enter their name and address, then click on a submit button. However I am getting bots entering

Cody1181609

: How do we know the source from which Bing indexed my webpage (a blog or website where my website link is placed)? One of my web pages with sensitive information was indexed

Barnes591

: Where can I redirect exploit scanning bots? I get a lot of those exploit scanning bots like the ones looking for a WordPress login (which I don't have) or guessing other exploitable URLs.

Shelley277

: Facebot sometimes fails to parse open graph meta data, causing share failure We maintain a WordPress-based website where Yoast plugin takes care of Open Graph meta tags generation. Recently we

Gonzalez347

: Why wouldn't a website appear in Google search results after robots.txt update to clean up hacked site? I have a WordPress website which was hacked a couple of days ago. I have tried to add

Michele947

: Re-Indexing Home Page - HTML/DOM Change (Site on WordPress) Currently, our company homepage is using images instead of divs to display our main products/solutions. As the SEO, I wanted to remove

Sherry384

: Are user agent names case-sensitive in robots.txt? I'm blocking various bots in robots.txt and I was wondering if their names are case-sensitive. For example: User-agent: grapeshot Disallow: /

Cody1181609

: Could a custom crawler find unlisted web pages? Example: A website has no sitemap.xml, no robots.txt, no index of those pages. Pages are not blocked, bots and humans have access, but they

Eichhorn148

: Rule out third party scraping, but allow Google crawling How to make scraping of my own content through wget, httrack etc. impossible, but allow crawling through Googlebot? This should be done without

Turnbaugh106

: A top directory in a URL path returns a 404 error, will its subpages still be crawled? I found a site in which one of the directories in the URL path returns 404 error, but the subpages

Reiling115

: Google unable to crawl and fetch my website I created my website www.tribologyconference.com on April 09, 2017 and on the same day I submitted this URL (with www and without) to Google and

Nimeshi995

: How do spambots find and submit to an email opt-in at the end of a funnel? I have an opt-in page that's being nuked right now with spambots. A simple webpage that I use to allow subscribers into

Jennifer507

: How should I protect "secret" links I send in emails from being indexed by search engines? I have noticed that Bing/msnbot tries to index pages that were only ever linked to in one single

Correia994

: Should article preview pages be crawled and indexed by search engines? I have a page called "all articles" that loads previews of articles using AJAX. Because of that, the content won't be

Nimeshi995

: Does adding a CDN stop Google from crawling? I want to add Cloudflare's free CDN to my site but am worried about whether Google bots will be able to crawl my site later on or not.

Kimberly868

: How to block only the Yandex bot Can you show me how robots.txt will look when I block only the Yandex bot, that is, allow the Google bot and block the Yandex bot?

Cofer257

: How does robots.txt work with sites in subfolders? I have a single web host with a number of other parked domains/sites in sub-directories, like this: example.com is the primary site and root

Rambettina238

: Ajax Content from Blocked Resource I have a site built in AngularJS. Most of the dynamically-loaded content comes from a WordPress back-end that is separate from the AngularJS site. In fact,

Marchetta884

: SEO - Pages are blocked because Google failed to get the resource since it is blocked by robots.txt I would like all the pages in my Angular site to be indexed by Google. I used ngMeta in my

Gretchen104

: How do you properly SEO-tag an app that is only a catalogue of images? I am making a little helper app that mostly links a lot of graphs and images. There is also some text explaining a

Mendez628

: Is it good to submit our site to website submission sites? I was wondering why we don't submit our site to website submission sites and increase our site's rank. Is it good to do this? How

Moriarity557

: Set bot crawl delay 10 seconds EXCEPT Googlebot? Is it possible to set the crawl delay for all bots to 10 seconds except Googlebot and Bing/Yahoo which can proceed at any pace? I like being indexed

Bryan171

: Checking site for meta refresh redirects How do you check or crawl a site (a couple of URLs) to explore existing meta refresh redirects? Screaming Frog doesn't handle them - it indicates a page

Harper822

: How to check whether infinite page scroll data is crawled and indexed by Google or not? I have a music platform website where 20 songs are visible at a time, then after

Martha676

: How can search engines find my RSS feed? I have a small blog. Now, I have created an rss.xml. I put that in the root of the site, on the server. Should I do anything to make search engines

Si4351233

: Google Search Console: 404 errors on existing pages There is a small website, a few years old, with very few pages (~5), which were indexed and ranked by Google. A few days ago 4 of those pages

Sent6035632

: Should I create a separate page for every ad for search engine indexing purposes on my website? I am going to create a simple ad marketing web site, so I should use databases as my storing

Rambettina238

: Which is more SEO friendly, dynamic pages or static pages, for a blogging website? I am new to web development and I am developing a blogging website. While working on its architecture from

Vandalay111

: "Is robots.txt blocking important pages?" in Search Console regarding static folder This is the contents of my robots.txt file: User-agent: * Disallow: /static/* Disallow: /templates/* Disallow: /translations/*

Connie744

: Does a link rel canonical tag pointing back to the page itself cause an infinite loop that wastes crawl budget? I assume that search engine robots crawl the whole page and then the canonical

Reiling115

: Robots.txt for website hosted in a subdirectory I have 2 websites which are hosted on shared hosting. 1st website example.com hosted in a root directory as /public_html/ 2nd website example2.com

Moriarity557

: How is noFollow enforced on sites like Quora and Facebook? I'm curious to know how search engines like Google enforce their noFollow policy on social sites. It seems like it would be largely

XinRu657

: Block Yandex crawler Our site has been behaving very strangely for the last few days, lots of timeouts etc. Finally think I found the cause, the Yandex bot is crawling around 10,000 pages

Shanna517

: Crawler says, page not found, but browser says otherwise I'm stumped. Google has been reporting increasing numbers of 404 errors for my website. But my site is static. Ok. Maybe they

Rivera981

: I need the subdomain in cPanel to not be followed. Help! I have little knowledge of robots.txt but I know how meta follow and nofollow tags work. The problem is I have 2 totally different

Murray432

: Spider-Trap on a GitHub Site I have a GitHub site and I hate web-crawlers that disobey or ignore robots.txt. How would I set up a Spider-Trap on a GitHub site that the robots.txt disallows

Sims2060225

: How does a webmaster perform a proper redirect from one domain to another? There are a few questions that are inherently present inside the main question. Those questions are: What is the

Harper822

: Is it possible to block search engine indexing using DNS alone? As per the title, is it possible to block all search engine indexing using DNS? Most guides point towards robots.txt or meta

Deb1703797

: My personal website was cloned in its entirety, is this a security concern? I was googling my (real) name yesterday (which is the name of my site) on a lark and discovered a website that

XinRu657

: Googlebot not respecting HTTP basic auth I have basic auth set up and it has always worked. Suddenly Google started crawling my pages. The auth is still there (I have checked it using different

Speyer207

: Do I need an umbrella services page to match the /services/ part of our URLs? From a usability perspective, this is not necessary. The mega menu dropdown on all pages makes it extremely clear

Sherry384

: SEO issues with Elm? Do search engines, particularly Google, render JavaScript created from transpiled Elm code when crawling? Can they follow links, even internal ones that modify the existing page?

Gonzalez347

: How does Google treat underscores in site map URLs? Google is currently reporting that my URLs are invalid within my sitemap. Here's an example of a document that was considered erroneous by

XinRu657

: Stopping Google from crawling my static domain I use a cookieless subdomain static.example.com to serve all images, js, and css files. This static subdomain has as its root directory the same

Sue5673885

: Google Analytics traffic surge from China, not real visitor, Baidu? UPDATE2: Seems this all comes from Baidu crawlers, here's some output created by GoAccess analyzing our logs. We've restricted


