Tag: RobotsTxt
Sorted by: Newest

: Do robots.txt and sitemap.xml need to be physical files? I have both set up in my routes: Route::get('/robots.txt', function() { // robots.txt contents here }); Route::get('/sitemap.xml', function()

: Disallow JS scripts in robots.txt for Googlebot To my knowledge, Googlebot is currently fully capable of rendering complex SPA applications, and this is recommended as a rule of thumb. The website

: Will Google penalize me for loading images that it can't see? (blocked by robots.txt) I have a script set up to load images for the visitor on another server. The robots.txt file on that server

: Do you need to add 301 redirect rules into robots.txt for search engines when you specify them in .htaccess? Do I need to add 301 redirects to the robots.txt file? I will be adding the redirects

: Does a "Disallow:" rule with nothing following it in robots.txt block my entire domain? If my website's robots.txt has the following code, does that mean my entire site is being blocked from
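
A quick illustration of the difference, shown as two alternative example files using standard robots.txt semantics: an empty Disallow value blocks nothing, while a lone slash blocks everything.

    User-agent: *
    Disallow:        # empty value: nothing is disallowed, the whole site may be crawled

    User-agent: *
    Disallow: /      # a single slash matches every path, blocking the entire site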

: Disallow cache folder of website using robots.txt If I disallow the cache folder of my website using robots.txt, will it have a bad effect on the ranking of my website or not?

: How can I prevent bad bots from clicking on my advertisers' ads? I have a website where I create ads for clients. They are not Adwords or similar, they are custom ads I create. All ad links

: Clean up hacked site by getting Google to crawl and index only the URLs in the sitemap So recently, our website has been hacked and we're trying to clean everything right now. But, when doing

: Allow a folder and disallow all sub folders in robots.txt I would like to allow folder /news/ and disallow all the sub folders under /news/ e.g. /news/abc/, /news/123/. How can I do that please?
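
A hedged sketch, assuming pages directly in /news/ should stay crawlable and only deeper directories need blocking: the * wildcard is an extension honoured by Google and Bing, not part of the original standard.

    User-agent: *
    Disallow: /news/*/   # blocks /news/abc/, /news/123/ and anything deeper, but not /news/ itself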

: Will a dynamic robots.txt file that disallows crawling based on the time of day hurt SEO? We have a serious traffic issue on our site and we want to eliminate crawlers as part of the problem.

: Use robots.txt to allow Google to crawl images but not index them I have a problem with on-page optimization. I checked my site to see if it is compatible with mobile devices using Google's

: Is it good practice to block crawling of a website's privacy policy with robots.txt? I'm not sure whether I should add my website's privacy policy to my robots.txt file or not. I want to follow

: Robots.txt isn't preventing my site from being crawled I'm having a problem with robots.txt. I put the robots.txt file in the website's main directory (and also in /var/www/html - to make it work

: Search Console - blocked in robots.txt In Search Console, when I use "Fetch as Google" I get this: resources not on my site are blocked by robots.txt. How do I fix this?

: How to disallow hash fragments in robots.txt file I have a URL like the following on my website that I want to stop all robots from crawling: /review/dir/dir/dir/dir/#review-form The rules I've tried are:

: Site's description not showing because of robots.txt? When I search on Google for my WordPress site's domain name without the TLD, it shows the site's information, but in the Title section it shows only

: "admin-ajax.php " file should allow or disallow in robots.txt file? My Wordpress site c-panel showing admin-ajax.php file wasting more CPU resources. how to solve the problem. Should i use this

: URL is indexed but description not available due to "this site's robots.txt" I have a URL like the following indexed by Google, https://example.com/companies/company?utm_source=xxx, with this output: A description

: How to stop indexing/crawling for Shop Checkout Summaries? I have a small shop checkout that uses cookies for my cart, and after the payment is done it generates a unique order-id and creates an

: How to prevent CDN content URLs being indexed by Google Well, robots.txt prevents crawling, and the meta robots tag in HTML (or the X-Robots-Tag HTTP header) prevents indexing (and other functionality

: Does Alexa's crawler need a robots.txt on each subdomain? I have a robots.txt on my www site. Do I also need one on other sites like my app and forums sites? The Alexa audit keeps finding bad

: How do you add a rule just for a specific bot to robots.txt? I have a small website, for which the current robots.txt looks like so: User-agent: * Disallow: Sitemap: https://www.myawesomesite.com/sitemap.xml
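
A sketch of how per-bot groups work: a crawler obeys only the most specific User-agent group that matches it, so a dedicated group can carry its own rules (SomeBot and /private/ are placeholder names here; the sitemap URL is the question's own).

    User-agent: *
    Disallow:

    User-agent: SomeBot
    Disallow: /private/

    Sitemap: https://www.myawesomesite.com/sitemap.xml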

: URLs with 'Noindex:' in robots.txt are being indexed by Google In my robots.txt file (http://www.tutorvista.com/robots.txt), I'm using Noindex: /content/... to disallow indexing: This should mean

: How to block everything from being indexed except sitemap.xml I want to block everything and index sitemap.xml file alone. So I do it as shown below: User-agent: * Disallow: / Allow: /sitemap.xml
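
For what it's worth, Google resolves Allow/Disallow conflicts by the most specific (longest) matching rule rather than by order, so a file along these lines works for Google, though Allow is an extension that some minor crawlers ignore.

    User-agent: *
    Allow: /sitemap.xml   # the longer, more specific match wins over the Disallow below
    Disallow: /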

: WordPress website not indexing in Google, no error in Google Webmaster Tools I just noticed my SEO score is 0 and that my WordPress website is not indexed by the Google search engine, no automatic

: Moving a site from one subdomain to another subdomain, what to do with pages which aren't "mappable"? I'm in the process of moving a site from one subdomain to another subdomain. For a lot of

: Preventing/blocking from crawling a specific user control of a page Currently, Google accesses/crawls a user control from the page given below: http://articles.mercola.com/sites/articles/archive/2017/07/20/do-fidget-spinners-help-anxiety

: Parts of my site are not getting indexed after changing the URLs, adding redirects, and blocking the old URLs in robots.txt I have a website with some URLs and later I optimized the URLs and

: How to create robots.txt and sitemap.xml for SEO, and get Google to index quickly My website is very slow to get indexed. Maybe you can help me get it indexed quickly. I am asking for help with the sitemap

: Prevent indexing of site search results Users have the ability to search on my site. This function renders a search results page that has occasionally been indexed by Google and served in SERPs.

: In a robots.txt file is a Noindex: command recognised? I have come across a website, and in its robots.txt file it has the following information: User-agent: * Noindex: /search Disallow: /search Sitemap:

: Internal Search Results: NOINDEX or robots.txt Blocking? We have our own internal search results pages for our download section. These results are currently crawlable but with NOINDEX meta tags

: How to implement HTTPS redirection for static resources such as robots.txt in legacy ASP.NET website? We switched to HTTPS in our legacy ASP.NET site about a month ago. We had already implemented

: When you move a site via a 301 redirect should you setup a robots.txt disallowing robots to crawl the old address? A site I am working on moved a subdomain to another subdomain via a 301

: 'Sitemap contains urls which are blocked by robots.txt.' Warning - However robots.txt file doesn't appear to be blocking anything Our site is built on WordPress. Whilst in the development stage,

: Remove subdomain & subfolder and their child pages from Google Search I have tried to remove my site's subdomain & subfolder and their child pages from Google Search with a robots.txt file, but I

: Should I prevent search engines from indexing empty user profile pages? If so, how much content is enough for indexing? I'm developing a social website for book readers, with public user profile

: Protocol Agnostic Robots Sitemap Recently, I have enabled all my servers to serve everything over HTTP and HTTPS. Users can access any site via http://www.example.com or https://www.example.com.

: How to tell Google that a domain no longer exists but still make redirects from that domain I have two domains: example.com and dev.example.net. example.com is the active one and dev.example.net

: Deleting a URL in Google Search Console causes a warning that something important has been blocked I have removed a page on my website and put the URL as Disallow in robots.txt and I removed

: Will it affect SEO if there are 30 links on a page where the destination is blocked by robots.txt? I have a webpage on which there are around 30 data points which link to 30 different pages

: Removing content in specific directories from Google search results We are using a plugin on our website to handle job application submissions. I found out recently that the applications were

: Redirecting robots.txt when moving to a new domain I'm moving a site to a new domain, to a local tld. I already redirected everything successfully to the new domain, except the robots.txt. My

: Huge tables won't fit on mobile, can I tell Google the page is desktop only? I'm working on a product comparison page that is composed of a giant table listing the attributes of a great many

: "Crawl-delay" rule ignored by googlebot In Google Webmaster Tools this warning showing for site: The yellow signal indicating "rule ignored by googlebot". How can I fix those warning?

: How to disallow bulk URLs in robots.txt? I built a dynamic sitemap generator on my platform. By mistake I generated some 300+ wrong URLs along with the right ones. And they have been there

: Why wouldn't a website appear in Google search results after a robots.txt update to clean up a hacked site? I have a WordPress website which was hacked a couple of days ago. I have tried to add

: Are user agent names case-sensitive in robots.txt? I'm blocking various bots in robots.txt and I was wondering if their names are case-sensitive. For example: User-agent: grapeshot Disallow: /
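
For reference, RFC 9309 and Google's documentation treat the User-agent value as case-insensitive, while path patterns in Allow/Disallow rules are case-sensitive, so a group like this matches "Grapeshot" regardless of casing:

    User-agent: grapeshot   # agent-name matching is case-insensitive
    Disallow: /Private/     # but this path pattern will not match /private/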

: How to serve a corresponding robots.txt file for each website in the same directory? I couldn't find the exact answer for this particular issue. I have two add-on domains https://domain001.com

: Fetch as Google: ad network has robots.txt with Disallow all, is it cloaking? We run ads on our site and I've noticed that in the eyes of Google the page isn't the same as in those of

: Should article preview pages be crawled and indexed by search engines? I have a page called "all articles" that loads previews of articles using AJAX. Because of that, the content won't be

: Google Blocked Resources showing incorrect data In the Google Webmaster Blocked Resources tab it is showing 500+ blocked resources (like .js and .css), but when I tested those URLs in the robots.txt Tester

: To allow crawling of all but a specific folder, do I need to include an empty disallow directive in robots.txt? If I want my website to be crawled, do I need an "empty" disallow? Is there
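
No empty directive is needed; anything not matched by a Disallow rule is crawlable by default, as in this minimal sketch (/private/ is a placeholder path):

    User-agent: *
    Disallow: /private/   # only this folder is blocked; everything else is implicitly allowed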

: How to block only the Yandex bot Can you show me what robots.txt will look like when I block only the Yandex bot and allow Googlebot?
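
A minimal sketch (Yandex's crawler identifies itself with the token "Yandex"): the specific group applies only to Yandex, and every other bot falls back to the catch-all group.

    User-agent: Yandex
    Disallow: /

    User-agent: *
    Disallow: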

: How does robots.txt work with sites in subfolders? I have a single web host with a number of other parked domains/sites in sub-directories, like this: example.com is the primary site and root

: SEO - Pages are blocked because Google failed to get the resource since it is blocked by robots.txt I would like all the pages in my Angular site to be indexed by Google. I used ngMeta in my

: Is rel="nofollow" necessary for a sponsered link through a redirect script disallowed by robots.txt? Paid links (e.g. affiliate links) should have the rel="nofollow" attribute added them to prevent

: Set bot crawl delay to 10 seconds EXCEPT Googlebot? Is it possible to set the crawl delay for all bots to 10 seconds except Googlebot and Bing/Yahoo, which can proceed at any pace? I like being indexed
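
A hedged sketch: Crawl-delay is a non-standard directive that Googlebot ignores anyway, but giving Googlebot its own group also makes it skip the catch-all group entirely (Google paces its own crawling automatically).

    User-agent: *
    Crawl-delay: 10

    User-agent: Googlebot
    Disallow:          # Googlebot uses only this group, so the delay above never applies to it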

: Use robots.txt to prevent privacy policy, terms and conditions, and guarantees from being crawled and indexed by Google I need to block pages such as privacy policy, terms and conditions, and
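
Assuming the pages live at paths like /privacy-policy and /terms (placeholders here), a sketch would look like the following; note that robots.txt only prevents crawling, and a blocked URL can still appear in results without a snippet, so a noindex meta tag may be the better tool.

    User-agent: *
    Disallow: /privacy-policy
    Disallow: /terms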

: Open ports on a website Alright. I risk sounding like an absolute newbie but I've got to know this. A friend and I paid a developer to develop a website for a start-up we had come up with.

: What's wrong with my robots.txt file to block Google? I have a website and I don't want this website to be visible on Google. So I deployed a robots.txt file. But the homepage of the website

: Google not crawling my site (robots.txt error) I'm currently performing SEO for my client's project. I'm kind of new to this, so please bear with me. I've read a lot of mixed reviews on the inclusion

: Changing sitemap.xml and robots.txt after moving from http to https I am migrating a website from http to https entirely, all http urls will have 301 redirects to their https counterparts. From

: How to have 2 sitemaps - 1 for the main site, the other for WordPress in a subfolder Background: My site runs on a dedicated server and has WordPress in a subfolder. I have my main site, which

: Google Search Console: Blocked by robots (Screenshot) I just discovered that there was a huge decrease in the number of pages that are blocked by robots in my search console. It is an online

: Geographical landing pages not showing up on Google search results Problem we have: one market's geographical landing pages do not appear in search results. A few months ago we discovered that robots.txt

: Prevent indexing of image or file I understand that you can do the following to generally prevent images or pages from getting indexed. Add to the page's meta section: <meta name="robots"

: Robots.txt proper way to set it up If you want to allow all search engines to crawl a site and only block one specific folder, is this correct? User-agent: * Disallow: /folder_name/ If you

: "Is robots.txt blocking important pages?" in Search Console regarding static folder This is the contents of my robots.txt file: User-agent: * Disallow: /static/* Disallow: /templates/* Disallow: /translations/*

: Robots.txt for a website hosted in a subdirectory I have 2 websites hosted on shared hosting. The 1st website, example.com, is hosted in the root directory as /public_html/. The 2nd website, example2.com

: I need the subdomain in cPanel to not be followed. Help! I have little knowledge of robots.txt, but I know how the meta follow and nofollow tags work. The problem is I have 2 totally different

: Spider-Trap on a GitHub Site I have a GitHub site and I hate web-crawlers that disobey or ignore robots.txt. How would I set up a Spider-Trap on a GitHub site that the robots.txt disallows

: How to Edit or Remove robots.txt on a WordPress powered website When using Google's fetch test on my WordPress site, it reports that my robots.txt is blocking pages and resources from being

: Wildcard matches in robots.txt that allow crawling of all JS and CSS are not working To allow the robots to crawl all CSS and JS files we have used the following code: Allow: /*.css$ Allow:
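
One hedged sketch of the intended combination (the /static/ Disallow is a placeholder assumption): per Google's documentation, conflicts between wildcard rules are resolved in favour of the least restrictive rule, so the Allow lines should keep CSS and JS crawlable even under a broader Disallow.

    User-agent: *
    Allow: /*.css$    # $ anchors the pattern to the end of the URL
    Allow: /*.js$
    Disallow: /static/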

: Blocking every URL with "?" in it Will the Disallow rule below block every URL which has "?" in it? Disallow: /*?*PrintPage=yes* I was actually checking a few pages with "?" in them and

: SERPs saying metadata is blocked by the robots.txt file after the issue was resolved Google crawled the robots.txt file when there was a mistake in it and the homepage was blocked. In the Google

: Do I need to add subdomain sitemap / robots.txt separately? Say I have a domain example.com And it has a subdomain sub.example.com And another subdomain internal.example.com I have some links
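
For what it's worth, robots.txt is per-host: crawlers fetch it only from the root of the exact host they are crawling, so each subdomain needs its own file, each able to point at its own sitemap (hostnames here are the question's examples).

    # served at https://sub.example.com/robots.txt
    User-agent: *
    Disallow:
    Sitemap: https://sub.example.com/sitemap.xml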

: Robots.txt blocking homepage in mobile SERPs but not desktop The metadata for the homepage on mobile states that the robots.txt file is blocking the page from being crawled. This is not

: Stopping Google from crawling my static domain I use a cookieless subdomain static.example.com to serve all images, JS, and CSS files. This static subdomain has as its root directory the same

: Can I list multiple sitemaps inside one sitemap.xml and specify that in my robots.txt? The sitemap engine I'm using generates a strange sitemap.xml. I need to add blog and shop sitemaps for

: Help with robots meta tag and X-Robots-Tag I have two questions about robots meta tag and X-Robots-Tag: If a page has both X-Robots-Tag HTTP header and robots meta tag, can this cause problems?

: Blocking vs noindex to reduce crawl requests I observed that GoogleBot is making a lot of duplicate requests for the same URLs from my website within a week. Amongst these requests a majority

: Block third domain from being indexed We have a web application that handles many websites that have test and production environments. This application has only one folder and it's hosted with

: Allow bots to read dynamically injected content I've got a pretty large Angular SPA, and I currently use ?_escaped_fragment_ to serve up static versions of all our pages. I've discovered, however,

: Effect of incomplete Disallow rule in robots.txt file Solved: Pages were being blocked by meta robots deliberately A lot of pages are being blocked in the robots.txt file and when I checked

: Not able to Open Website from Google Search My website http://www.softnice.com (WARNING: Possibly hacked site) If I search in Google for "softnice" or "www.sofnice.com" I am not able to open my

: Query String Pages - Do You Block Them via Robots.txt? Or Do Canonicals, rel="prev"/rel="next" Do the Job? Can someone please give me advice on how to deal with query string pages like /index?page=323,
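
If blocking is the chosen route, a sketch using the wildcard extension could look like this (the page parameter is taken from the question's example); canonical or rel="prev"/rel="next" markup would instead leave the URLs crawlable and consolidate them:

    User-agent: *
    Disallow: /*?page=   # matches URLs such as /index?page=323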

: Google claims not to find my robots.txt Like I said in the title, the Google Webmaster Console claims not to find the robots.txt file of my homepage and therefore it won't crawl it. But there

: Google Search Console error with path to sitemap.xml I have a shop site and the platform created a sitemap.xml automatically. But in Google Search Console there is an error because the robots.txt looks like

: Hiding search pages via robots.txt I have a full-text search engine installed on my site, accessible from URLs like /search/<search-query>, and a search landing page at /search, containing a large "Search"

: How can I get my eCommerce website re-crawled within 24 hrs? For my travel eCommerce website, I already added the sitemap in Webmaster Tools & resubmitted it daily. I also did Fetch as Google, but still

: Robots.txt with only Disallow and Allow directives is not preventing crawling of disallowed resources I have a robots.txt file: User-agent:* Disallow:/path/page Disallow:/path/ Allow:/ The disallowed

: Page blocked by robots.txt showing up in site: search results with a description that is a mix of Chinese, English, and German I found a strange search result for a resource blocked by robots.txt.

: Google Webmasters Blocked Resources/Robots.txt File? For some reason Google Webmaster Tools insists that pretty much all js and css resources are blocked on every page of my website and when I

: Allow crawlers to pass link juice from similar and duplicated user created pages My users commonly create new pages with similar/duplicate content. For example: http://example.com/post?123 http://example.com/post?456

: Using asterisks in robots.txt Can spiders crawl sites if there are no asterisks used in the robots.txt? For example: User-agent: * Disallow: .*/ping/.* then User-agent: * Disallow: /ping/ Please
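
For reference, robots.txt patterns are not regular expressions: paths are simple prefix matches, and the only special characters major engines honour are the * wildcard and the $ end anchor, so a pattern like .*/ping/.* is taken literally and matches nothing useful.

    User-agent: *
    Disallow: /ping/   # prefix match: blocks /ping/ and everything beneath it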

: Applebot not crawling sitemap.xml On our website we noticed that the Applebot is not crawling our sitemap.xml, so it is unaware of most of our internal webpages. We have no robots.txt restrictions

: Will Google remove links restricted by the robots.txt from the soft 404 list? I have updated a shop system to a newer version. I now get a huge number of soft 404 errors in the Webmaster Tools

: How do I edit robots.txt on a Google site? I need to make changes to someone's robots.txt file, but their site is managed by Google (so no FTPing). I have full access to the site via the

: Index site via domain name, but not via direct IP For some context, I'm modifying somebody's website. This particular site is not yet published - it is accessible via direct IP. Problem: Google