Perl regex extracting domain name from URL

http://stackoverflow.com/questions/11875630/how-do-i-get-the-host-name-from-a-url-in-perl-using-regex
code.
use strict;
use warnings;

use URI::Split qw/ uri_split uri_join /;

# Split the URL into its parts, then rejoin only the scheme and the host.
my $scheme_host = do {
    my (@parts) = uri_split 'http://linux.pacific.net.au/primary.xml.gz';
    uri_join @parts[0,1];
};

print $scheme_host;
code..
If you cannot install the module:
code.
use strict;
use warnings;

my $url = 'http://linux.pacific.net.au/primary.xml.gz';
# Capture everything up to the first dot, then everything up to the next slash.
my ($scheme_host) = $url =~ m|^( .*?\. [^/]+ )|x;
print $scheme_host;
code..
Outputs: http://linux.pacific.net.au

br.
http://stackoverflow.com/questions/2497215/extract-domain-name-from-url
code.
#!/usr/bin/perl -w
use strict;

my $url = $ARGV[0];
# $1 is the optional scheme, $2 is the host (must contain at least one dot).
if ($url =~ /([^:]*:\/\/)?([^\/]+\.[^\/]+)/) {
    print $2;
}
code..
Usage:
code.
./test.pl 'https://example.com'
example.com
./test.pl 'https://www.example.com/'
www.example.com
./test.pl 'example.org/'
example.org
./test.pl 'example.org'
example.org
./test.pl 'example'
-> no output
code..
_"And if you just want the domain and not the full host + domain use this instead:"_
code.
#!/usr/bin/perl -w
use strict;

my $url = $ARGV[0];
# $2 absorbs leading subdomain labels; $3 is the last two labels of the host.
if ($url =~ /([^:]*:\/\/)?([^\/]*\.)*([^\/\.]+\.[^\/]+)/) {
    print $3;
}
code..

br.
http://stackoverflow.com/questions/15627892/perl-regex-to-get-the-root-domain-of-a-url
code.
my $facebook = "www.facebook.com/xxxxxxxxxxx";
$facebook =~ s/www\.(.*\.com).*/$1/; # keep what follows www. up to and including .com
print $facebook;
code..
_"Returns"_ facebook.com
_"You may also want to make this work for .net, .org, etc. Something like:"_
code.
s/www\.(.*\.(?:net|org|com)).*/$1/;
code..

br.
http://www.perlmonks.org/?node_id=670802

br.
http://www.willmaster.com/blog/perl/extracting-domain-name-from-url.php
_"First, remove the http/https and possible www. from the front of the URL:"_
code.
$url =~ s!^https?://(?:www\.)?!!i;
code..
_"Then, strip off everything from the first "/" to the end of the URL (doing nothing if there is no "/"):"_
code.
$url =~ s!/.*!!;
code..
_"Last, in case the URL was http://example.com?stuff or http://example.com#stuff or http://example.com:80/whatever, also strip off everything from the first "?" or "#" or ":", if present:"_
code.
$url =~ s/[?#:].*//;
code..
_"The value of $url is now the domain name by itself."_

#perl - #programming - #regex
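
The three willmaster.com substitutions can be wrapped in a single sub. This is only a minimal sketch: the name `domain_from_url` and the sample URLs are my own assumptions, not part of the quoted article.

```perl
use strict;
use warnings;

# Hypothetical helper; the name is an assumption, not from the quoted article.
sub domain_from_url {
    my ($url) = @_;
    $url =~ s!^https?://(?:www\.)?!!i;   # drop the scheme and an optional leading www.
    $url =~ s!/.*!!;                      # drop everything from the first "/"
    $url =~ s/[?#:].*//;                  # drop a query string, fragment, or port
    return $url;
}

# Illustrative inputs only.
for my $url (
    'http://www.example.com/path/page.html',
    'https://example.com:80/whatever',
    'http://example.com?stuff',
) {
    print domain_from_url($url), "\n";
}
```

Each of the three sample URLs should print example.com. Because the last step cuts at the first ":", a URL that carries credentials (user:pass@host) would be cut at the password separator, so URI or URI::Split is probably the safer choice when the input is not under your control.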