Vcoderz Community


god 08-06-2007 11:31 AM

The robots.txt file!
 
I thought I'd give you guys an idea about this file!
As we all know, the Google crawler (the engine that pulls pages into Google's database so you can search them) follows every link it can reach, so it can end up indexing pages the owner never meant to show up in search, like an /admin/ folder that isn't actually password protected.
Website owners don't want people to know the contents of these directories, so they .htaccess them (password protect them). A properly protected folder can't be crawled, but anything the crawler reached before the protection went up, or anything left unprotected, can still turn up through a Google search plus the Google cache! So they should also tell Google to stay out of these directories. How? In the site's root, they create a file called robots.txt,
for example http://www.********.com/robots.txt
The robots.txt file looks like this:
Code:

# everything after a "#" is not taken into consideration
# you can write anything here !
# ammouna
# "User-agent" names the crawler the rules below apply to; "*" means every crawler
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /cache/
Disallow: /class/
Disallow: /images/
Disallow: /include/
Disallow: /install/
Disallow: /kernel/
Disallow: /language/
Disallow: /templates_c/
Disallow: /themes/
Disallow: /uploads/

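So well-behaved crawlers like Google's will stay out of these directories and won't crawl what's inside. That doesn't make them secure, though: robots.txt is only a request, anyone can read the file, and it does nothing to stop someone from visiting a folder directly.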
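To try rules like the ones above without touching any real site, here is a minimal sketch using Python's standard `urllib.robotparser`. The shortened rule set, the `is_allowed` helper, and the sample paths are assumptions made up for illustration; only the `Disallow` entries come from the example file.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the example file (hypothetical sample, not a real site)
ROBOTS_TXT = """\
# everything after a "#" is ignored
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /uploads/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch path."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/uploads/photo.jpg", "/tmp/"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", path) else "disallowed"
        print(path, "->", verdict)
```

This only answers the question a polite crawler asks itself before fetching a page. Nothing here stops a browser, or a crawler that ignores the file, from requesting the same path.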
Now how can you actually use this info? It depends :D It can be quite useful, and sometimes meaningless!

The most famous robots.txt file on the net is ..... uh ... I forgot it lol, I'll update it later..

Yeah, I found it :D Here it is: www.whitehouse.gov/robots.txt
Check it out, it's OK to open it :P

OS7 08-06-2007 01:21 PM

Re: The robots.txt file!
 
Man, what do you mean by this? :S I didn't understand anything.
Can you please tell us directly what you mean? And thank you.

SysTaMatIcS 08-06-2007 01:32 PM

Re: The robots.txt file!
 
So what's the use of the robot.txt? We won't have access to the files. I don't get it.

Krazy 08-06-2007 01:39 PM

Re: The robots.txt file!
 
As I understood...
If you add it to your website, then Google search won't access the files in the folders mentioned.

But I think we can still reach the pages if we try to access the folder directly.
So you'll know which folders they don't want Google search to access.

Justin 08-06-2007 01:41 PM

Re: The robots.txt file!
 
Systa... it's not an .exe :o :o wtf?! :|

It's a .txt file... a Notepad file :S.. Let's call it.. gaining useful info... And a big thank you to God for these amazing posts and ideas; without him we wouldn't learn them...

god 08-06-2007 02:09 PM

Re: The robots.txt file!
 
Quote:

Originally Posted by systamatics (Post 90822)
So what's the use of the robot.exe? We won't have access to the files. I don't get it.

in reply to that:
Quote:

Originally Posted by god (Post 90802)
Now how can you actually use this info? It depends :D It can be quite useful, and sometimes meaningless!

Figure that out yourself :P As I said, it depends. If you're trying to find something on a website, this file might give you a better understanding of the site's structure, and maybe tell you which CMS the website is using... Look, I can't tell you what to do letter by letter, that would be illegal :P Plus I won't do it, hehe.
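As a rough illustration of reading a site's structure out of this file, the sketch below pulls the `Disallow` paths out of a robots.txt given as plain text. The `disallowed_paths` helper and the sample text are hypothetical; the folder names are borrowed from the example file earlier in the thread, and the parsing is deliberately simplified (it ignores which `User-agent` group a rule belongs to).

```python
from typing import List

# Hypothetical sample text, reusing folder names from the example robots.txt
SAMPLE = """\
# you can write anything here !
User-agent: *
Disallow: /cgi-bin/
Disallow: /kernel/   # trailing comments are ignored too
Disallow: /templates_c/
Disallow:
"""


def disallowed_paths(robots_txt: str) -> List[str]:
    """Collect every non-empty Disallow path, in file order."""
    paths = []
    for raw in robots_txt.splitlines():
        # Drop anything after "#" and surrounding whitespace
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty "Disallow:" blocks nothing, so it is skipped
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

The output is just the list of folders the owner asked crawlers to skip, which is the "structure of the site" hint mentioned above. Since the file is public, reading it is no different from opening it in a browser.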

