Millions of websites make desperate attempts to be noticed and indexed by Google, so why would anyone ever want to remove a link from Google search? There are a few reasons to do this.
Maybe some information about a new product or service got leaked too early, maybe Google showed a page in search results that you deleted a week ago, or maybe Google showed an outdated version of a page instead of the updated one. Whatever the reason may be, there are a couple of ways to accomplish this.
Add the robots meta-tag with the value “NOINDEX” inside the head section of the page you don’t want to appear in Google Search.
This will prevent the page from being indexed by Google and other search engines altogether, for as long as the tag remains in place.
The HTML code for the robots meta-tag looks like this.
<meta name="robots" content="noindex">
Copy-paste the code inside the head section of the page you wish to remove.
If a page already has the robots meta-tag with the value set to “index”, change it to “noindex” and you are done.
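If you want to confirm the tag is actually in place before waiting for a re-crawl, the check can be scripted. The sketch below uses only Python's standard library to parse a page's HTML and report whether a robots noindex directive is present; the function names are my own, not part of any tool mentioned here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of any <meta name="robots"> tags in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            self.directives.append((attrs.get("content") or "").lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in directive for directive in parser.directives)
```

For example, `has_noindex('<head><meta name="robots" content="noindex"></head>')` returns `True`, while a page whose robots tag still says `index` returns `False`.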
Now you just have to wait until search engines re-crawl the page and remove it from their search index.
This is by far the most efficient way of removing or blocking a URL from crawlers.
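One caveat: the meta-tag only works for HTML pages. For files such as PDFs or images, the same directive can be sent as an X-Robots-Tag HTTP header instead. On an Apache server, a sketch of this (assuming mod_headers is enabled) might look like:

```apacheconf
# Send a noindex directive for every PDF served by the site
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
```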
robots.txt is a file you create in the root directory of your WordPress installation. It is used to tell search engines which parts of the website are not allowed to be crawled.
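For example, a minimal robots.txt that asks all crawlers to stay out of a hypothetical /private/ directory while leaving the rest of the site open could look like this:

```
User-agent: *
Disallow: /private/
```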
However, there are some limitations and risks to using robots.txt as a URL-blocking tool:
Crawlers are not obligated to heed the instructions in robots.txt
robots.txt can’t force crawlers to heed its instructions. It’s just letting them know that the webmaster is not happy with certain pages being crawled. It’s totally up to them to decide whether or not to respect those instructions.
Of course Google crawlers and other well-known ones will obey these rules but others may not. That’s the main reason why people prefer to password-protect their pages instead.
A robotted page can still be indexed if linked to from other sites
Even though Google crawlers comply with the instructions in a site’s robots.txt file, they may still crawl and index a disallowed URL if other websites have linked to that page.
Therefore, certain details, such as the URL and sometimes a title drawn from the anchor text of those links, may still be publicly visible and appear in SERPs.
To block URLs leading to confidential information that you don’t want to appear in Google search results, just store all those pages inside a password-protected directory on your web server.
This way you are blocking off not only search engine crawlers but also crawlers from third parties. Therefore, this is the most secure and effective way to block private URLs.
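On an Apache server, one common way to do this is HTTP Basic authentication via .htaccess. A minimal sketch, assuming a password file created with the htpasswd utility (the path below is illustrative; keep the file outside your web root):

```apacheconf
# .htaccess inside the directory you want to protect
AuthType Basic
AuthName "Restricted Area"
# Example path to the password file created with htpasswd
AuthUserFile /home/youruser/.htpasswd
Require valid-user
```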
Google Search Console
If you own or control the website that contains the information you wish to remove, first delete the page, and then access Google Search Console, a website management console provided by Google to webmasters.
It contains useful tools and statistics that can help you rank higher on Google. If you haven’t already, sign up for a Google Search Console account now.
The following conditions must be met to get Google to remove any data.
- The page you wish to remove should already be indexed by Google.
- The page you wish to remove should already have been deleted or had its URL changed.
- The website on which the page was hosted should be verified with Google Search Console.
Let’s now take a look at some scenarios where you can use Google’s URL removal tool.
- Removing websites under-construction from Google
Developing a website may take a considerable amount of time depending on the scale and its functionalities. People don’t want Google or other search engines to crawl, index, and make their site public even before the development process is complete.
Google can detect the existence of your pages when you share their URLs on Facebook, Skype, or any other platform, including email.
Before you take steps to remove information from Google, be sure to delete or rename the pages that you don’t want Google to show.
After you rename those pages, be sure to password-protect them. Password-protected pages won't show up in search engines since crawlers can't access them without credentials.
Next, you should send a URL removal request to Google.
To do this, first log in to Google Search Console and navigate to “Google Index” > “Remove URLs”. Click on the “Temporary hide” button and type in the exact URL of the page you wish to remove. Note that it’s case-sensitive.
Google will then prompt you to choose the reason for removal. Choose “Temporarily hide page from search results and remove from cache” and click on “Submit Request”.
It may take a couple of days before Google actually processes this request. You will be able to see the status of the removal request from the same page.
- Removing leaked information
Let’s say you are creating a new section in your site and don’t want search engines to find it until it’s complete. You may not have included the link to the new section inside your menu but you may have linked to it from an older page which is already indexed. This makes it really easy for Google to crawl and index your page.
Therefore, make sure that the pages belonging to the new section are not publicly available by renaming and password-protecting them. Also, file a request to remove the link from Google’s database.
- Removing cached pages
Google takes a snapshot of every webpage it indexes to use in situations where the original page is not available. Cached links will show you what the page looked like the last time Google crawled it.
What if Google shows the correct information in SERPs but takes visitors to a cached version of the page when clicked? How do we tell Google to update the cache of a page?
It’s really simple. Just go to Google Search Console and navigate to “Google Index” > “Remove URLs”. Type in the URL of the page you wish to clear the cache of and select “Remove page from cache only” from the drop-down menu which indicates the reason of removal. Then click on “Submit Request”.
For how long will the URL be blocked?
It’s not permanent, of course. The site or page you removed won’t appear in search results for at least 90 days. However, if the same site or page is still available and accessible after those 90 days have passed, Google may try to re-index it.
If you want a permanent solution, use the robots meta-tag.
When should I not use Google’s URL removal tool?
According to Google, the URL removal tool should only be used for emergencies. For example, it’s ok to use the tool to remove a page containing confidential information. However, using the tool in any of the following scenarios is not recommended.
- As a fix for 404 pages. Do not use the tool to remove any 404 pages since Google’s crawlers would naturally drop them out of search results when re-crawled.
- To cover tracks. If you want to start clean with a domain name you purchased from someone else, this tool is not for you. Instead, file a reconsideration request directly to Google’s team.
- To take your site offline after being hacked. By all means, you can use the tool to remove the pages hackers created but don’t try to take down your entire site. Just clean up the hacked pages and wait for Google to re-index your site.
- To block certain variations of URLs. Don’t try to block the www-versions of your URLs. This may result in removing everything, including the non-www versions.
How do I cancel a URL removal request?
Let’s say you finished updating the site and want Google to re-index your pages before the 90 days have gone by. Then what? Well, you can easily cancel the removal request by heading over to Google Search Console.
Navigate to “Google Index” > “Remove URLs”. There you will see a list of all pending URL removal requests. To see a list of all removed pages, select “Show: Removed” from the drop-down menu above the table.
Identify the link you wish to re-index and click on the “Reinclude” button. The changes will be reflected within a few hours or days.
As you can see there are several ways to block or remove unwanted URLs from search engines.
Personally, I do not like the robots.txt procedure since it’s the least reliable method of them all. Crawlers that don’t follow the instructions could easily crawl and index the disallowed URLs.
I prefer using the noindex robots meta-tag instead, especially since it’s a permanent solution.
No matter which procedure you follow, you would still need to apply for a URL removal request from Google Search Console.
The bottom line is, keep your priorities straight and only send removal requests to Google if you have a page that needs to be removed as soon as possible. Otherwise, just create a 301 redirect for the deleted page.
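A 301 redirect for a deleted page can be set up in .htaccess on an Apache server, for example (the paths are placeholders for your own old and new URLs):

```apacheconf
# Permanently redirect the deleted page to its replacement
Redirect 301 /old-page/ /new-page/
```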
Did you face any problems while trying to remove or block a URL from Google? If so, what are they? Let us know in the comment section below.