# If the Joomla site is installed within a folder,
# e.g. www.example.com/joomla/, the robots.txt file MUST be
# moved to the site root at e.g. www.example.com/robots.txt
# AND the joomla folder name MUST be prefixed to the disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/orig.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html
#
#
# Syntax, explained:
# The * agent is kept out of the directories I do not allow, which are
# all the ones listed with Disallow: (path).
# Each rule works as a block: it names the agent(s) and then what is
# allowed or blocked for them.
#
# Explicitly allow the Twitter, Google, G+, Facebook, LinkedIn and
# Pinterest agents everywhere: Disallow is left empty (no restrictions).
#
User-agent: Twitterbot
User-agent: google
User-agent: googlebot
User-agent: facebookexternalhit
User-agent: LinkedInBot
User-agent: Pinterest
Disallow:
# Crawl-delay: 10

#
# This group applies to every agent not named in a group of its own
# (the asterisk). They may NOT enter the directories listed here:
#
User-agent: *
Disallow: /_xcopias/
Disallow: /_country/
Disallow: /administrator/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /upload/
Disallow: /upload_p/
Disallow: /x.samples/
Disallow: /xcache_images_apps/

#
# Blocked crawlers start here. Do NOT allow this crawler anywhere.
#
User-agent: BLEXBot
Disallow: /

#
# And extend the block to every other robot (asterisk).
# Note: parsers that merge groups sharing a user-agent (as RFC 9309
# requires) combine this with the * group above, so for unnamed robots
# the net effect is Disallow: / and the directory list adds nothing.
#
User-agent: *
Disallow: /
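
#
# Example only (kept commented out so it has no effect): a minimal
# sketch of the subfolder case described at the top of this file.
# It assumes a hypothetical install at www.example.com/joomla/ with
# this file moved to www.example.com/robots.txt; only three of the
# directories are shown, and the rest would take the same prefix.
#
# User-agent: *
# Disallow: /joomla/administrator/
# Disallow: /joomla/cache/
# Disallow: /joomla/tmp/
#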