What does /*.php$ mean in robots.txt?
I came across a site that uses the following in its robots.txt file:
User-agent: *
Disallow: /*.php$
So what does it do? Will it prevent web crawlers from crawling the following URLs?
https://example.com/index.php
https://example.com/index.php?page=events&action=upcoming
Will it block subdomains too?
https://subdomain.example.com/index.php
So what does it do?
By the spec, it means "URLs starting with /*.php$", which isn't very useful. There might be engines out there that support a custom syntax for it. I know some support wildcards, but this looks like regular expression syntax, and I've not heard of anything that supports that in robots.txt.
Will it prevent web crawlers from crawling the following URLs?
By the spec: no.
If the crawler supports regexes, it would block the first one but not the second.
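To make the two readings concrete, here is a minimal Python sketch. It is not taken from any crawler's source: the helper names prefix_match and wildcard_match are hypothetical, and the extended behavior assumes that * matches any run of characters and that a trailing $ anchors the end of the URL, which is how some crawlers (Google's among them) document their handling. It also assumes the rule is compared against the path plus query string.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    # Literal reading: the rule is a plain prefix of the URL path,
    # so '*' and '$' are ordinary characters.
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    # Assumed extended reading: '*' matches any run of characters and a
    # trailing '$' anchors the end of the URL. Everything else is literal.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start, which mirrors prefix semantics.
    return re.match(pattern, path) is not None


RULE = "/*.php$"
for path in ("/index.php", "/index.php?page=events&action=upcoming"):
    print(path, prefix_match(RULE, path), wildcard_match(RULE, path))
```

Under the literal reading neither URL is matched, because neither path starts with the characters /*.php$. Under the assumed extended reading only /index.php is matched, since the second URL does not end in .php.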
Will it block subdomains too?
No. Each origin is independent when it comes to robots.txt. The subdomain is a separate site and would need its own copy of the resource.
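As a small illustration of the per-origin point, the sketch below derives the robots.txt location a crawler would consult for a given page. The function name robots_url is hypothetical; it only uses the standard library's urllib.parse.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    # Keep the scheme and host (the origin), replace the path with
    # /robots.txt, and drop any query string or fragment.
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/index.php"))
print(robots_url("https://subdomain.example.com/index.php"))
```

The two pages resolve to different robots.txt files, so rules published on example.com say nothing about subdomain.example.com.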