How to tell bots to forget a website and reindex it from scratch
I was wondering if there was anything I could add to robots.txt or sitemap to tell bots to completely forget everything they know about a website and index it from scratch?
Context: After replacing a website done in CMS-x with a new one done in CMS-y, 99% of pages/links/resources will be gone or moved to different locations, and even though there are proper 404/410 redirects in place, it still would be better if any bots indexing the website would not try to access old stuff.
Basically this: "How to tell google a blog article has been updated?", but site-wide.
sitemap indexing robots.txt redesign
2
I'd also recommend you read support.google.com/webmasters/answer/1663419?hl=en. It explains "If you recently changed your site and now have some outdated URLs in the index, Google's crawlers will see this as we recrawl your URLs, and those pages will naturally drop out of our search results. There's no need to request an urgent update."
– Trebor
Jan 22 at 16:39
1
404/410 are not "redirects." They are errors, which will grossly inconvenience your users.
– Kevin
Jan 22 at 22:00
@Kevin: You mean you haven't put a redirect on a 404? It has the amusing property of inconveniencing the bots but users don't notice. (In the old days a 404 error redirected to the 404 page; so redirects on 404 were honored.)
– Joshua
Jan 22 at 22:04
@Joshua My 404s redirect to a proper 404 page with a human-readable explanation and a search function.
– Pit
Jan 23 at 6:34
@Trebor Unfortunately, there are more bots. Google alone would not be a problem; I have already submitted the new sitemap to them.
– Pit
Jan 23 at 6:36
edited Jan 23 at 12:31 by Community♦
asked Jan 22 at 13:44 by Pit
1 Answer
That isn't possible. You need to map your old URLs to the new ones with redirects, both for SEO and for user experience.
Google never forgets about old URLs, even after a decade. When you migrate to a new CMS, you need to implement page-level redirects.
If there is no equivalent for a particular page, you can let it return 404 and Google will remove it from the index. Using "410 Gone" instead gets Google to drop the URLs from the index as soon as they are crawled, without the 24-hour grace period that Google uses for "404 Not Found."
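As a rough illustration of the three outcomes described above (redirect, "410 Gone", "404 Not Found"), here is a minimal Python sketch of the decision. The paths and the `REDIRECTS` and `GONE` tables are made up for the example; on a real site this logic would normally live in the web server or CMS configuration.

```python
from typing import Optional, Tuple

# Hypothetical mapping of old paths to their new equivalents.
REDIRECTS = {
    "/old/about.html": "/about/",
    "/old/contact.html": "/contact/",
}

# Hypothetical set of old paths that were removed on purpose.
GONE = {"/old/guestbook.html"}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a requested path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]  # page moved: permanent redirect
    if path in GONE:
        return 410, None  # page retired deliberately
    return 404, None  # anything else is simply not found


print(respond("/old/about.html"))
print(respond("/old/guestbook.html"))
print(respond("/no/such/page"))
```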
There is no directive, either in Google Search Console or in robots.txt, to tell bots to forget about an old site.
What if you don't redirect?
It may be too much work to redirect, or your new CMS may not make the redirect implementation easy.
If you choose not to implement the redirects, it will be something like starting over. Google will see that your old URLs return a 404 status and will remove them from the search index.
Your new URLs will eventually get indexed, but it may take a while. Changing all your URLs without redirects is a big sign that your site isn't stable and can't be trusted. All your rankings will be lost and your site will start over.
Googlebot will continue to crawl the old URLs for years. For it, hope springs eternal that you may someday put those pages back up.
If you do redirect, all your inbound links, users' bookmarks, and most of your current rankings will be preserved.
Why?
So why don't search engines have a "reset" button? Because there are almost always better options. In your case it is much better to redirect.
In the case when a site is penalized, Google doesn't provide a reset button because that might remove all penalties.
How?
So how do you implement the redirects? You need a list of your old URLs. You may have a sitemap from your old site that you can start with. You can also get the list from your server logs, Google Analytics, or even from Google Search Console.
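If the old sitemap is still around, pulling the list of old paths out of it can be scripted. The following is a minimal Python sketch, assuming a standard sitemaps.org XML file; the `SAMPLE` document and the `example.com` URLs are placeholders, not taken from the site in question.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlsplit

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def old_paths(sitemap_xml: str) -> List[str]:
    """Return the sorted, de-duplicated URL paths listed in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    paths = set()
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text:
            # Keep only the path; the host stays the same after the migration.
            paths.add(urlsplit(loc.text.strip()).path)
    return sorted(paths)


# Placeholder sitemap standing in for the one exported from the old CMS.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/old/contact.html</loc></url>
  <url><loc>https://example.com/old/about.html</loc></url>
</urlset>"""

for path in old_paths(SAMPLE):
    print(path)
```

The resulting list is only a starting point; URLs that appear in server logs or analytics but not in the sitemap would still need to be added by hand.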
If you planned ahead, the URLs in your new CMS will be similar to the old ones and you can implement a rewrite rule to handle them. If there is a pattern between the old and new URLs, it can be a one-liner in a .htaccess file that issues the redirects for the entire site.
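To make the idea of a pattern between old and new URLs concrete, here is a small Python sketch that checks a candidate pattern against sample paths before it goes into the server configuration. The URL schemes (`/articles/<slug>.html` becoming `/blog/<slug>/`) are invented for illustration. In Apache, a pattern like this could probably be expressed with a single mod_alias line such as `RedirectMatch 301 ^/articles/(.+)\.html$ /blog/$1/`, but treat that as an assumption to verify against your own setup.

```python
import re
from typing import Optional

# Hypothetical old scheme: /articles/<slug>.html
# Hypothetical new scheme: /blog/<slug>/
OLD_PATTERN = re.compile(r"^/articles/(.+)\.html$")


def new_path(old_path: str) -> Optional[str]:
    """Map an old path to its new location, or None if the pattern does not apply."""
    match = OLD_PATTERN.match(old_path)
    if match is None:
        return None
    return "/blog/{}/".format(match.group(1))


print(new_path("/articles/hello-world.html"))
print(new_path("/gallery/1"))
```

Running the old URL list through a function like this shows quickly which pages the pattern covers and which ones need an individual mapping.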
If you have to manually find the new URLs and map thousands of them one by one, you could look into RewriteMap functionality.
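A plain-text RewriteMap file is a list of lines, each holding a key and a value separated by whitespace. Below is a minimal Python sketch that renders a hand-built mapping in that shape; the `MAPPING` entries are hypothetical. Note that the `RewriteMap` directive itself has to be declared in the server or virtual host configuration rather than in .htaccess.

```python
from typing import Dict, List


def rewrite_map_lines(mapping: Dict[str, str]) -> List[str]:
    """Render old->new path pairs as 'key value' lines for a text map file."""
    lines = []
    for old, new in sorted(mapping.items()):
        # Key and value are separated by whitespace, so neither may contain any.
        if any(ch.isspace() for ch in old + new):
            raise ValueError("whitespace in mapping: %r -> %r" % (old, new))
        lines.append("%s %s" % (old, new))
    return lines


# Hypothetical hand-built mapping of old CMS paths to new CMS paths.
MAPPING = {
    "/old/contact.html": "/contact/",
    "/old/about.html": "/about/",
}

print("\n".join(rewrite_map_lines(MAPPING)))
```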
1
Thanks. As it is a complete redesign, with most of the old stuff gone and only a few old pages being somehow represented in the new one, I will redirect those, and for the rest I have to rely on Google doing its business …
– Pit
Jan 22 at 16:29
1
While the codes don't work well for Google, search and SEO, there may be cases where you want to use "303 See Other" and "300 Multiple Choice" redirects.
– Stephen Ostermiller♦
Jan 22 at 16:58
edited Jan 22 at 17:00
answered Jan 22 at 13:52 by Stephen Ostermiller♦