Robots.txt - Exclude any URL containing "/node/"

How do I tell crawlers/bots not to index any URL that contains the /node/ pattern? The following rule has been in place since day one, but I noticed that Google has still indexed a lot of URLs with /node/ in them, e.g. www.mywebsite.com/node/123/32

Disallow: /node/

Is there a rule that says not to index any URL that contains /node/? Should I write something like the following: Disallow: /node/*

Update: The real problem is that, despite having Disallow: /node/ in the robots.txt file, Google has indexed pages under this URL, e.g. www.mywebsite.com/node/123/32

/node/ is not a physical directory; it is how Drupal 6 exposes its content. I think my problem is that node is not a directory, just a part of the URLs that Drupal generates for content. How can I handle this? Will the following work?

Disallow: /*node

Thanks

+3
3 answers

Disallow: /node/ will block any URL whose path starts with /node/ (after the host). An asterisk is not required.

Therefore, it will block www.mysite.com/node/bar.html, but it will not block www.mysite.com/foo/node/bar.html.
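To see the prefix behaviour concretely, here is a small local check using Python's standard `urllib.robotparser`, which applies plain prefix matching. It is only a sketch: the `User-agent: *` line is an assumption (the question shows just the Disallow rule), `is_allowed` is a hypothetical helper name, and this parser is not guaranteed to behave exactly like Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents; only the Disallow line comes from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /node/
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if the rules above let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/node/bar.html",
        "http://www.mysite.com/foo/node/bar.html",
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

The first URL should be reported as blocked and the second as allowed, because the rule only matches paths that begin with /node/.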

If you want to block everything that contains /node/, you need to write Disallow: */node/

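Note also that Googlebot caches robots.txt, reportedly for up to 7 days. So if you changed robots.txt recently, it may take a while before Googlebot picks up the new copy of your robots.txt.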

+5

Disallow: /node/* should do what you want. In robots.txt patterns, * means "any sequence of characters". See Google's robots.txt documentation for details.
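The difference between a plain prefix rule and a `*` wildcard pattern can be sketched in a few lines of Python. This is an illustrative approximation of the wildcard semantics (`*` for any sequence of characters, a trailing `$` for end of path), not Googlebot's actual implementation; `rule_to_regex` and `is_blocked` are hypothetical names.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any sequence of characters; a trailing `$` anchors the end.
    Without `$`, the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(path, disallow_pattern):
    """Return True if `path` matches the Disallow pattern from its start."""
    if not disallow_pattern:
        # An empty Disallow value blocks nothing.
        return False
    return rule_to_regex(disallow_pattern).match(path) is not None


if __name__ == "__main__":
    for pattern in ("/node/", "/node/*", "*/node/"):
        for path in ("/node/123/32", "/foo/node/bar.html"):
            print(pattern, path, is_blocked(path, pattern))
```

Under these assumptions, `/node/` and `/node/*` behave identically, and only `*/node/` also catches `/foo/node/bar.html`.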

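Alternatively, you can tell crawlers not to index the pages by sending an HTTP response header. For example, in the htaccess for node: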

Header set x-robots-tag: noindex
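To confirm that the header is really being sent for a /node/ page, a check along these lines may help. It is a minimal sketch: `has_noindex` and `url_sends_noindex` are hypothetical helpers, and the parsing ignores the optional user-agent prefix form of the header value (such as `googlebot: noindex`).

```python
from urllib.request import urlopen


def has_noindex(headers):
    """Return True if any X-Robots-Tag header value contains `noindex`.

    `headers` is an iterable of (name, value) pairs.
    """
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # One value may carry several comma-separated directives,
        # e.g. "noindex, nofollow".
        directives = [d.strip().lower() for d in value.split(",")]
        if "noindex" in directives:
            return True
    return False


def url_sends_noindex(url):
    """Fetch `url` and report whether the response asks not to be indexed."""
    with urlopen(url) as response:
        return has_noindex(response.headers.items())
```

For example, `url_sends_noindex("http://www.mywebsite.com/node/123/32")` would request the page from the question and inspect its response headers.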
0

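Your original Disallow is fine. It may simply be taking time for Googlebot to fetch the updated robots.txt and then drop the affected pages from its index.

A few more thoughts: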

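Page URLs can still appear in Google's search results even when robots.txt blocks them. See: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449 (it explains that although Google will not crawl or index the content of pages blocked by robots.txt, it may still index the URLs if it finds them on other pages).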

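Second, I would go to Google Webmaster Tools (https://www.google.com/webmasters/tools/home?hl=en) and, under Health → "Fetch as Google", check whether there are any problems retrieving one of these pages. (For example, does it report that robots.txt is blocking the page?)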

Bing has a similar tool: http://www.bing.com/webmaster/help/fetch-as-bingbot-fe18fa0d. It seems sensible to use the tools that Google, Bing, etc. provide to diagnose this kind of problem.


0