I want to achieve the following:
1) Scan a range of IP addresses.
2) For any IP address that has port 80 open, record some useful information about the website, such as the first 50 characters of the first page and certain metadata.
3) Log the results so that a specific website can be found.
I'm quite happy working with Linux and nmap at a command line, but I need some suggestions about looping through multiple websites and recording the information mentioned.
Thanks.
Looks like Lynx combined with some grep will work.
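Something like this, perhaps, assuming lynx is installed ($ip here just stands in for whichever address you're checking):
# render the page as plain text and keep the first 50 characters
lynx -dump "http://$ip/" | tr -d '\n' | cut -c 1-50
# metadata could be grepped out of the raw source instead
lynx -source "http://$ip/" | grep -i '<meta'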
I can't think of an easy way to do that, or any reason why you would want to. But just beware that, should the hosts be using vhosts, you might just get the default apache/iis/lighttpd... page when doing a GET to port 80 of the IP.
ping IP, if responsive, telnet to port 80, use curl or wget to pull the content?
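Roughly like this in shell, say (the address and timeouts are just examples, and I've used nc for the port check rather than an interactive telnet):
ip=192.168.0.10                      # example address
if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
    # plain TCP connect test on port 80
    if nc -z -w 2 "$ip" 80 2>/dev/null; then
        curl -s --connect-timeout 3 "http://$ip/" | head -c 50
    fi
fi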
What about a simple for-each loop?
for x in $n ; do <whatever you're doing to extract the data> ; done
Internal IP addresses or external/Internet ones?
Python has some libraries that you could use for that, but the only script I have that comes close is one that parses nmap XML output from a ping sweep.
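Not that Python script, but the same idea can be sketched in shell with nmap's grepable output (the network range is just an example; -sn is -sP on older nmap versions):
# ping sweep, then pull out the addresses that answered
nmap -sn -oG - 192.168.1.0/24 | awk '/Status: Up/ {print $2}' > live_hosts.txt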
That's pretty simple to do.
How are you generating the list of IP addresses? Are they sequential, a list in a file or on the command line?
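All three are easy enough to loop over; roughly (check_host is a made-up stand-in for whatever does the actual work):
# sequential range
for i in $(seq 1 254); do check_host "192.168.1.$i"; done
# a list in a file, one address per line
for ip in $(cat ips.txt); do check_host "$ip"; done
# addresses passed on the command line
for ip in "$@"; do check_host "$ip"; done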
the function would be something like:
function getheader {
    # expect exactly one argument: the host or IP to check
    if [ $# != 1 ]; then
        return 1
    fi
    # skip anything that doesn't resolve (note: bare IPs with no PTR record will also fail this check)
    if ! host "$1" &>/dev/null; then
        return 1
    fi
    # only fetch the page if nmap reports port 80 open
    if nmap -p 80 "$1" 2>&1 | grep -q "80/tcp open"; then
        # -s keeps curl quiet; strip newlines and keep the first 50 characters
        curl -s "http://$1/" | tr -d '\n' | cut -c 1-50 > "$1.txt"
    else
        return 1
    fi
    return 0
}
then just write a loop calling getheader <ip>
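e.g. something like this, assuming the addresses are in a file called ips.txt:
for ip in $(cat ips.txt); do
    getheader "$ip" && echo "$ip: saved first 50 chars"
done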
That should work, though it could be cleaner: it will spawn a lot of processes, so you could do the tr | cut bit in a single awk call. However, that would take me longer than 2 mins 🙂 as I would need to read the man page.
or do you need help with the loop part?
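For what it's worth, a rough sketch of doing the tr | cut bit in one awk process (not tested against the function above):
# concatenate all lines, then print the first 50 characters
curl -s "http://$ip/" | awk '{ buf = buf $0 } END { print substr(buf, 1, 50) }'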
I wouldn't bother using nmap. Just let curl try, and use a short --connect-timeout to fail quickly in case the target is dropping rather than refusing connections.
for ip in $(cat ips.txt) ; do echo $ip ; curl http://$ip/ --connect-timeout 3 | head -c 50 ; echo ; done > log.txt
purpleyeti makes a good point - connecting via a URL with an IP in it may not get you the expected result due to virtual hosting.
curl FTW
If you are going to use:
for ip in $(cat ips.txt)
You may want to first set IFS=$'\n'
You may want to first set IFS=$'\n'
Why? Default for IFS already contains newline.
By setting the IFS you ensure that lines in a file are kept intact when using $(cat $file): the default IFS also splits on spaces and tabs, so a line containing spaces would be broken into several words. Not that important in this case, but still a useful thing to set to ensure that each line = one variable.
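A quick illustration, assuming hosts.txt is a made-up file whose lines contain spaces (e.g. "192.168.0.1 webserver"):
IFS=$'\n'
for line in $(cat hosts.txt); do
    echo "got: $line"    # with IFS set to newline, each whole line is one word
done
unset IFS                # restore default word splitting afterwards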