
robots.txt: what it is and how to use it


What do they do exactly?

A robots.txt file gives your instructions to search engine robots.

The first thing a search engine spider looks for when it visits a website is the robots.txt file, which sits at the top level of the site. It looks for it because it wants to know what it should do. If you have instructions for a search engine robot, this file is where you write them.
The most common problem people have with robots.txt files is that they don't know how to make them.

If you can make web pages, you can also make a robots.txt file. The file is a text file, which means that you can use Notepad, WordPad (saving as plain text), or any other plain text editor. You can also make one in FrontPage or Dreamweaver by using the "code" view. You can even copy and paste one.

So instead of thinking "I am making a robots.txt file", just think "I am writing a note". They are exactly the same process. However you would write a note or a letter on your computer will work for the robots.txt file.

What should the robots.txt file say?

That depends on what you want it to do.

Most people want robots to visit everything on their website. If this is the case with you, and you want the robot to index all parts of your site, there are three ways to let the robots know that they are welcome.
1) Do not have a robots.txt file
If your website does not have a robots.txt file then this is what happens -
A robot comes to visit. It looks for the robots.txt file. It does not find it because it isn't there. The robot then feels free to visit all your web pages and content because this is what it is programmed to do in this situation.
2) Make an empty file and call it robots.txt
If your website has a robots.txt file that has nothing in it then this is what happens -
A robot comes to visit. It looks for the robots.txt file. It finds the file and reads it. There is nothing to read, so the robot then feels free to visit all your web pages and content because this is what it is programmed to do in this situation.
3) Make a file called robots.txt and write the following two lines in it (these are "instructions" for the robot to follow)

User-agent: *
Disallow:
If your website has a robots.txt file with these instructions in it then this is what happens -

A robot comes to visit. It looks for the robots.txt file. It finds the file and reads it. It reads the first line. Then it reads the second line. An empty "Disallow:" means nothing is off limits, so the robot then feels free to visit all your web pages and content because this is what you told it to do.
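
As a quick check of options 2 and 3, here is a minimal sketch that uses Python's standard urllib.robotparser module to read an empty file and the two-line file, then asks whether a robot may fetch a page. The robot name ExampleBot and the example.com address are made up for illustration, and real search engines may parse files somewhat differently from this module.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt, robot_name, url):
    """Parse robots.txt text and report whether robot_name may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot_name, url)


EMPTY_FILE = ""                            # option 2: a file with nothing in it
ALLOW_ALL = "User-agent: *\nDisallow:\n"   # option 3: the two-line file

page = "https://example.com/photos/mycar.jpg"  # hypothetical page

print(may_fetch(EMPTY_FILE, "ExampleBot", page))  # True
print(may_fetch(ALLOW_ALL, "ExampleBot", page))   # True
```

Both files give the same answer, which is why the three options are interchangeable when you want everything crawled.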

What do the robot instructions mean?

Here is an explanation of what the different words mean in a robots.txt file.
User-agent:
The "User-agent" part is there to specify directions to a specific robot if needed. There are two ways to use this in your file.

If you want to tell all robots the same thing, you put a "*" after "User-agent:". It would look like this...
User-agent: *
(This line is saying "these directions apply to all robots")

If you want to tell a specific robot something (in this example Googlebot) it would look like this...
User-agent: Googlebot
(This line is saying "these directions apply to just Googlebot")
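
To see the two kinds of User-agent line working together, the sketch below parses a file with one group for Googlebot and one for everybody else, again with Python's standard urllib.robotparser. The /drafts/ folder, the robot name OtherBot and the example.com address are assumptions for illustration only; the Disallow line itself is explained in the next part.

```python
from urllib.robotparser import RobotFileParser

# Two groups separated by a blank line: one for Googlebot, one for all robots.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/drafts/post.html"  # hypothetical page
googlebot_ok = parser.can_fetch("Googlebot", url)
otherbot_ok = parser.can_fetch("OtherBot", url)

print(googlebot_ok)  # False: the Googlebot group applies
print(otherbot_ok)   # True: falls back to the "*" group
```

A robot uses the group that names it and only falls back to the "*" group when no group mentions it.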
Disallow:
The "Disallow" part is there to tell the robots which folders or files they should not look at.

This means that if, for example, you do not want search engines to index the photos on your site, you can place those photos into one folder and exclude it.

Let's say that you have put all these photos into a folder called "photos". Now you want to tell search engines not to index that folder.

Here is what your robots.txt file should look like:

User-agent: *
Disallow: /photos

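The above two lines of text in your robots.txt file would keep robots from visiting your photos folder. The "User-agent: *" part is saying "this applies to all robots". The "Disallow: /photos" part is saying "don't visit or index my photos folder". Note that the rule matches any address that starts with /photos, so it would also cover a page such as /photos.html. Writing "Disallow: /photos/" with a trailing slash limits the rule to the folder itself.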
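
The effect of the photos rule can be tried out with Python's standard urllib.robotparser module. This is only a sketch: the robot name ExampleBot, the example.com address and the file names are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /photos"])


def blocked(path):
    """Return True when the rules above keep a robot away from path."""
    return not parser.can_fetch("ExampleBot", "https://example.com" + path)


print(blocked("/photos/beach.jpg"))  # True: inside the photos folder
print(blocked("/index.html"))        # False: not covered by the rule
print(blocked("/photos.html"))       # True: the address starts with /photos
```

The last line shows why the exact spelling of the path matters: a Disallow value is compared against the start of the address, not against a folder name.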

Googlebot specific instructions

The robot that Google uses to build its search index is called Googlebot. It understands a few more instructions than the basic "User-agent" and "Disallow" pair. The instructions it follows are well defined in the Google help pages.

In addition to "User-agent" and "Disallow", Googlebot also uses the...
Allow:
The "Allow:" instruction lets you tell a robot that it is okay to see a file in a folder that has been "Disallowed" by other instructions.

To illustrate this, let's take the above example of telling the robot not to visit or index your photos. We put all the photos into one folder called "photos" and we made a robots.txt file that looked like this...
User-agent: *
Disallow: /photos

Now let's say there was a photo called mycar.jpg in that folder that you want Googlebot to index. With the Allow: instruction, we can tell Googlebot to do so. It would look like this...

User-agent: *
Disallow: /photos
Allow: /photos/mycar.jpg
This would tell Googlebot that it can visit "mycar.jpg" in the "photos" folder, even though the "photos" folder is otherwise excluded.
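
The reason the Allow line wins here is that Googlebot follows the most specific (longest) matching rule, and on a tie prefers Allow. The sketch below is a hand-written illustration of that idea, not Google's actual code: the function name is_allowed is made up, and wildcards such as * and $ are deliberately left out.

```python
def is_allowed(rules, path):
    """Apply the longest matching rule; with no match, the path is allowed.

    rules is a list of (directive, prefix) pairs from one User-agent group,
    for example ("disallow", "/photos").
    """
    best_length = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty or non-matching rule has no effect here
        is_longer = len(prefix) > best_length
        is_tie_for_allow = len(prefix) == best_length and directive == "allow"
        if is_longer or is_tie_for_allow:
            best_length = len(prefix)
            allowed = directive == "allow"
    return allowed


rules = [("disallow", "/photos"), ("allow", "/photos/mycar.jpg")]

print(is_allowed(rules, "/photos/mycar.jpg"))  # True
print(is_allowed(rules, "/photos/beach.jpg"))  # False
print(is_allowed(rules, "/index.html"))        # True
```

Some simpler parsers, including Python's urllib.robotparser, instead apply the first rule that matches, so placing the Allow line before the Disallow line is a cautious choice if other robots matter to you.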
Testing your robots.txt file
If you have submitted a sitemap through Google's webmaster tools, then you can log in and see if Google is having any issues crawling your site. There is also a robots.txt tool that allows you to experiment a little, letting you know if there are any problems with your file before you put it online.
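
You can also run a rough check on your own computer before uploading anything. The sketch below, using Python's standard urllib.robotparser, lists which of the pages you care about a draft file would block. The function name find_blocked and the example.com addresses are assumptions for illustration, and because this module does not match rules exactly the way Googlebot does, treat it as a first pass rather than a replacement for Google's tool.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt, robot_name, urls):
    """Return the urls that robot_name may not fetch under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(robot_name, url)]


# A draft with a common mistake: "Disallow: /" shuts robots out of the whole site.
draft = "User-agent: *\nDisallow: /\n"

must_be_crawlable = [
    "https://example.com/",
    "https://example.com/about.html",
]

print(find_blocked(draft, "Googlebot", must_be_crawlable))
```

An empty list means none of the listed pages are blocked; here both pages come back, which is the signal to fix the draft before it goes online.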

Key Concept:


- If you use a robots.txt file, make sure it is correctly written because an incorrect robots.txt file can block the bots that index your website.
