Perl regex extracting domain name from URL
use strict;
use warnings;
use URI::Split qw/ uri_split uri_join /;
my $scheme_host = do {
my (@parts) = uri_split 'http://linux.pacific.net.au/primary.xml.gz';
uri_join @parts[0,1];
};
print $scheme_host;
If you cannot install the module:
use strict;
use warnings;
my $url = 'http://linux.pacific.net.au/primary.xml.gz';
my ($scheme_host) = $url =~ m|^( .*?\. [^/]+ )|x;
print $scheme_host;
outputs: http://linux.pacific.net.au
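The fallback regex above returns the scheme and host as one string. When the two parts are needed separately, a stricter anchored pattern may be easier to reason about. The following is only a sketch using core Perl: the helper name scheme_and_host is hypothetical, and it assumes the URL carries no user:password@ part.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original snippets): split a URL into
# scheme and host with core Perl only. Assumes there is no userinfo part.
sub scheme_and_host {
    my ($url) = @_;
    # Scheme: a letter followed by letters, digits, '+', '.' or '-', then '://'.
    # Host: everything up to the next '/', '?', '#' or ':' (port separator).
    if ($url =~ m{^([A-Za-z][A-Za-z0-9+.-]*)://([^/?#:]+)}) {
        return ($1, $2);
    }
    return;    # empty list when the input does not look like scheme://host
}

my ($scheme, $host) = scheme_and_host('http://linux.pacific.net.au/primary.xml.gz');
print "${scheme}://${host}\n";    # prints http://linux.pacific.net.au
```

Unlike the one-line regex, this version returns nothing for input without a scheme, such as 'example.org'.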
http://stackoverflow.com/questions/2497215/extract-domain-name-from-url
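#!/usr/bin/perl
use strict;
use warnings;
my $url = $ARGV[0];
# Anchor at the start so a dotted path segment is never mistaken for the host.
# $1 is the optional scheme ("https://"), $2 is the host.
if ($url =~ m{^([^:]*://)?([^/]+\.[^/]+)}) {
    print "$2\n";
}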
Usage:
./test.pl 'https://example.com'
example.com
./test.pl 'https://www.example.com/'
www.example.com
./test.pl 'example.org/'
example.org
./test.pl 'example.org'
example.org
./test.pl 'example' -> no output
"And if you just want the domain and not the full host + domain use this instead:"
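#!/usr/bin/perl
use strict;
use warnings;
my $url = $ARGV[0];
# $1 is the optional scheme, $2 the leading subdomain labels,
# $3 the last two dot-separated labels of the host (for example "example.com").
if ($url =~ m{^([^:]*://)?([^/]*\.)*([^/.]+\.[^/]+)}) {
    print "$3\n";
}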
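Keeping only the last two labels works for example.com, but for a host such as linux.pacific.net.au it yields net.au rather than pacific.net.au. The sketch below shows one way to handle that, assuming a small hand-written table of two-label suffixes. The table and the helper name root_domain are illustrative assumptions, not part of the original snippets; a complete solution would consult the Public Suffix List.

```perl
use strict;
use warnings;

# Illustrative and deliberately incomplete table of two-label suffixes.
# This list is an assumption made for the example only.
my %two_label_suffix = map { $_ => 1 } qw(co.uk org.uk net.au com.au co.jp);

# Hypothetical helper: reduce a host name to its registrable ("root") domain.
sub root_domain {
    my ($host) = @_;
    my @labels = split /\./, lc $host;
    return if @labels < 2;    # a single label is not a domain with a suffix

    # Keep two labels by default, three when the host ends in a known
    # two-label suffix and there is a label in front of it.
    my $keep = 2;
    $keep = 3 if @labels >= 3 && $two_label_suffix{ join '.', @labels[-2, -1] };
    return join '.', @labels[ -$keep .. -1 ];
}

print root_domain('www.example.com'), "\n";         # prints example.com
print root_domain('linux.pacific.net.au'), "\n";    # prints pacific.net.au
```

The helper expects a bare host name, so the scheme and path should be stripped first, for instance with one of the regexes above.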
http://stackoverflow.com/questions/15627892/perl-regex-to-get-the-root-domain-of-a-url
my $facebook = "www.facebook.com/xxxxxxxxxxx";
$facebook =~ s/www\.(.*\.com).*/$1/; # keep what follows "www." up to and including ".com"
print $facebook;
"Returns"
facebook.com
"You may also want to make this work for .net, .org, etc. Something like:"
s/www\.(.*\.(?:net|org|com)).*/$1/;
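The substitution above can be wrapped so that the scheme and the www. prefix are both optional and the TLD alternation is built from a list. This is a rough sketch only: the helper name bare_domain and the contents of @tlds are assumptions made for illustration, and input that does not match is returned unchanged.

```perl
use strict;
use warnings;

# Assumed list of accepted TLDs; extend it as needed.
my @tlds   = qw(com net org);
my $tld_re = join '|', map { quotemeta($_) } @tlds;

# Hypothetical helper: strip scheme, leading "www." and path from a URL.
sub bare_domain {
    my ($url) = @_;
    # Copy the URL, then replace the whole string with the captured domain.
    (my $domain = $url) =~
        s{^(?:[a-z]+://)?(?:www\.)?([^/]+\.(?:$tld_re))(?:/.*)?$}{$1}i;
    return $domain;
}

print bare_domain('www.facebook.com/xxxxxxxxxxx'), "\n";    # prints facebook.com
print bare_domain('http://www.example.org/page'), "\n";     # prints example.org
```

Anchoring the pattern at both ends means a "www." that appears later in the path is not touched.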
#perl - #programming - #regex
From JR's : articles