Life as a Web developer can be hard when things start going wrong, because the problem could be in any number of places. Is it in the request you’re sending, or in the response? Is it in a request made by a third-party library you’re using? Is an external API failing?
Good tools are invaluable in figuring out where problems lie, and can also help to prevent problems from occurring in the first place, or just help you to be more efficient in general. Command line tools are particularly useful because they lend themselves well to automation and scripting, where they can be combined and reused in all sorts of different ways. Here we cover six particularly powerful and versatile tools which can help make your life a little bit easier.
(Image credit: kolnikcollection)
Curl
Curl is a network transfer tool that’s very similar to Wget, the main difference being that by default Wget saves to a file, and curl outputs to the command line. This makes it really simple to see the contents of a website. Here, for example, we can get our current IP from the ifconfig.me website:
$ curl ifconfig.me
93.96.141.93
Curl’s -i (show headers) and -I (show only headers) options make it a great tool for debugging HTTP responses and finding out exactly what a server is sending to you:
$ curl -I news.ycombinator.com
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: private
Connection: close
The -L option is handy, and makes curl automatically follow redirects. Curl has support for HTTP Basic authentication, cookies, manually setting headers and much, much more.
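For scripting, the status code is often all you need from those headers. The following is a minimal sketch rather than a feature of curl: `last_status` is a hypothetical helper name, and it assumes the headers come from something like `curl -sIL`, where `-s` silences the progress meter.

```shell
# last_status: read HTTP response headers on stdin and print the status
# code of the final response (that is, after any redirects).
# Hypothetical helper for illustration; not part of curl.
last_status() {
  tr -d '\r' | awk '/^HTTP\// { code = $2 } END { print code }'
}

# Usage (needs network access; the credentials and header are placeholders):
#   curl -sIL news.ycombinator.com | last_status
#   curl -sIL -u user:password -H 'Accept: text/html' http://example.com/ | last_status
```

Keeping the parsing separate from the curl call means the helper can be tried on saved header output without touching the network.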
Ngrep
For serious network packet analysis there’s Wireshark, with its thousands of settings, filters and configuration options. There’s also a command-line version, TShark. For simple tasks I find Wireshark can be overkill, so unless I need something more powerful, ngrep is my tool of choice. It allows you to do with network packets what grep does with files.
For Web traffic you almost always want the -W byline option, which preserves linebreaks, and -q is a useful argument which suppresses some additional output about non-matching packets. Here’s an example that captures all packets that contain GET or POST:
ngrep -q -W byline "^(GET|POST) .*"
You can also pass in additional packet filter options, such as limiting the matched packets to a certain host, IP or port. Here we filter all traffic going to or coming from Google, using port 80 and containing the term “search”:
ngrep -q -W byline "search" host www.google.com and port 80
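Since ngrep patterns look much like `grep -E` patterns, a pattern can be rehearsed on saved output before running a live capture. This is a minimal sketch, assuming the capture was saved to a file with `-W byline` so that each request line starts on its own line; `count_requests` and `capture.txt` are hypothetical names.

```shell
# count_requests FILE: tally the request lines in a saved capture by
# method and path, most frequent first.
# Hypothetical helper; it reuses the GET/POST pattern from the ngrep example.
count_requests() {
  grep -E '^(GET|POST) ' "$1" | awk '{ print $1, $2 }' | sort | uniq -c | sort -rn
}

# Usage:
#   count_requests capture.txt
```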
Netcat
Netcat, or nc, is a self-described networking Swiss Army knife. It’s a very simple but also very powerful and versatile application that allows you to create arbitrary network connections. Here we see it being used as a port scanner:
$ nc -z example.com 20-100
Connection to example.com 22 port [tcp/ssh] succeeded!
Connection to example.com 80 port [tcp/http] succeeded!
In addition to creating arbitrary connections, Netcat can also listen for incoming connections. Here we use this feature of nc, combined with tar, to very quickly and efficiently copy files between servers. On the server, run:
$ nc -l 9090 | tar -xzf -
And on the client:
$ tar -czf - dir/ | nc server 9090
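The tar halves of that pipeline can be rehearsed on a single machine before Netcat is involved. This is a sketch under that assumption; `send_dir` and `recv_dir` are hypothetical helper names, and `-f -` tells tar to use standard output (or standard input) as the archive.

```shell
# send_dir DIR: write a gzipped tar archive of DIR to standard output.
send_dir() {
  tar -czf - "$1"
}

# recv_dir DEST: unpack a gzipped tar archive from standard input into DEST,
# creating DEST first if it does not exist.
recv_dir() {
  mkdir -p "$1" && tar -xzf - -C "$1"
}

# Local rehearsal, no network needed:
#   send_dir dir/ | recv_dir /tmp/copy
# Across machines, with Netcat in the middle:
#   server$ nc -l 9090 | recv_dir /tmp/copy
#   client$ send_dir dir/ | nc server 9090
```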
We can use Netcat to expose any application over the network. Here we expose a shell over port 8080:
$ mkfifo backpipe
$ nc -l 8080 0<backpipe | /bin/bash > backpipe
We can now access the server from any client:
$ nc example.com 8080
uname -a
Linux li228-162 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux
While the last two examples are slightly contrived (in reality you’d be more likely to use tools such as rsync to copy files and SSH to remotely access a server), they do show the power and flexibility of Netcat, and hint at all of the different things you can achieve by combining Netcat with other applications.
Sshuttle
Sshuttle allows you to securely tunnel your traffic via any server you have SSH access to. It’s extremely easy to set up and use, not requiring you to install any software on the server or change any local proxy settings.
By tunneling your traffic over SSH, you secure yourself against tools like Firesheep and dsniff when you’re on unsecured public Wi-Fi or other untrusted networks. All network communication, including DNS requests, can be sent via your SSH server:
$ sshuttle -r <server> --dns 0/0
If you provide the --daemon argument, sshuttle will run in the background as a daemon. Combined with some other options, you can make aliases to simply and quickly start and stop tunneling your traffic:
alias tunnel='sshuttle --daemon --pidfile=/tmp/sshuttle.pid -r <server> --dns 0/0'
alias stoptunnel='[[ -f /tmp/sshuttle.pid ]] && kill `cat /tmp/sshuttle.pid`'
You can also use sshuttle to get around the IP-based geolocation filters that are now used by many services, such as BBC’s iPlayer, which requires you to be in the UK, and Turntable, which requires you to be in the US. To do this, you’ll need access to a server in the target country. Amazon has a free tier of EC2 Micro instances that are available in many countries, or you can find a cheap virtual private server (VPS) in almost any country in the world.
In this scenario, rather than tunneling all of our traffic, we might want to send only traffic for the service we are targeting. Unfortunately, sshuttle only accepts IP address arguments and not hostnames, so we need to make use of dig to first resolve the hostname:
$ sshuttle -r <server> `dig +short <hostname>`
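One caveat is that `dig +short` can print CNAME targets as well as addresses, which would not fit the IP-only restriction described above. Here is a minimal sketch that keeps only IPv4 addresses; `only_ipv4` is a hypothetical helper name, and the pattern is deliberately loose (it does not check that each octet is at most 255).

```shell
# only_ipv4: pass through only the stdin lines that look like IPv4 addresses.
# Hypothetical helper; a loose shape check, not a full address validator.
only_ipv4() {
  grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# Usage (needs network access and a reachable SSH server):
#   sshuttle -r <server> $(dig +short <hostname> | only_ipv4)
```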
Siege
Siege is an HTTP benchmarking tool. In addition to load-testing features, it has a handy -g option that is very similar to curl’s -iL, except it also shows you the request headers. Here’s an example with Google (I’ve removed some headers for brevity):
$ siege -g www.google.com
GET / HTTP/1.1
Host: www.google.com
User-Agent: JoeDog/1.00 [en] (X11; I; Siege 2.70)
Connection: close

HTTP/1.1 302 Found
Location: http://www.google.co.uk/
Content-Type: text/html; charset=UTF-8
Server: gws
Content-Length: 221
Connection: close

GET / HTTP/1.1
Host: www.google.co.uk
User-Agent: JoeDog/1.00 [en] (X11; I; Siege 2.70)
Connection: close

HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
X-XSS-Protection: 1; mode=block
Connection: close
What Siege is really great at is server load testing. Just like ab (an Apache HTTP server benchmarking tool), you can send a number of concurrent requests to a site, and see how it handles the traffic. With the following command, we’ll test Google with 20 concurrent connections for 30 seconds, and then get a nice report at the end:
$ siege -c20 www.google.co.uk -b -t30s
...
Lifting the server siege... done.
Transactions: 1400 hits
Availability: 100.00 %
Elapsed time: 29.22 secs
Data transferred: 13.32 MB
Response time: 0.41 secs
Transaction rate: 47.91 trans/sec
Throughput: 0.46 MB/sec
Concurrency: 19.53
Successful transactions: 1400
Failed transactions: 0
Longest transaction: 4.08
Shortest transaction: 0.08
One of the most useful features of Siege is that it can take a file of URLs as an input, and then hit those URLs rather than just a single page. This is great for load testing, because you can replay real traffic against your site and see how it performs, rather than just hitting the same URL again and again. Here’s how you would use Siege to replay your Apache logs against another server to load test it:
$ cut -d ' ' -f7 /var/log/apache2/access.log > urls.txt
$ siege -c<concurrency rate> -b -f urls.txt
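Field 7 of Apache’s common and combined log formats holds only the request path, whereas a Siege URL file is normally a list of full URLs. The sketch below prefixes each logged GET path with a base URL. `log_to_urls` and the staging host are hypothetical, and it assumes the default common or combined log format, so a custom LogFormat would need different field numbers.

```shell
# log_to_urls BASE: read an Apache access log on stdin and print BASE
# followed by the request path for every GET request.
# Hypothetical helper; other methods are skipped because their request
# bodies are not recorded in the log.
log_to_urls() {
  awk -v base="$1" '$6 == "\"GET" { print base $7 }'
}

# Usage (the staging host is a placeholder):
#   log_to_urls http://staging.example.com < /var/log/apache2/access.log > urls.txt
#   siege -c20 -b -t30s -f urls.txt
```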
Mitmproxy
Mitmproxy is an SSL-capable, man-in-the-middle HTTP proxy that allows you to inspect both HTTP and HTTPS traffic, and rewrite requests on the fly. The application has been used to uncover quite a few iOS application privacy scandals, including Path’s address book upload scandal. Its ability to rewrite requests on the fly has also been used to target iOS, including setting a fake high score in GameCenter.
Far from only being useful to see what mobile applications are sending over the wire or for faking high scores, mitmproxy can help out with a whole range of Web development tasks. For example, instead of constantly hitting F5 or clearing your cache to make sure you’re seeing the latest content, you can run
$ mitmproxy --anticache
which will automatically strip the request headers that let a server answer “304 Not Modified” (such as If-Modified-Since and If-None-Match), so you always get fresh content. Unfortunately, it doesn’t automatically set up forwarding for you like sshuttle does, so after starting mitmproxy you still need to change your system-wide or browser-specific proxy settings.
Another extremely handy feature of mitmproxy is the ability to record and replay HTTP interactions. The official documentation gives an example of a wireless network login. The same technique can be used as a basic Web testing framework. For example, to confirm that your user signup flow works, you can start recording the session:
$ mitmdump -w user-signup
Then go through the user signup process, which at this point should work as expected. Stop recording the session with Ctrl + C. At any point we can then replay what was recorded and check for the 200 status code:
$ mitmdump -c user-signup | tail -n1 | grep 200 && echo "OK" || echo "FAIL"
If the signup flow gets broken at any point, we’ll see a FAIL message, rather than an OK. You could create a whole suite of these tests and run them regularly to make sure you get notified if you ever accidentally break anything on your site.
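Such a suite could be sketched as a small runner that applies the same check to several recordings. The names `check_flow`, `run_suite`, `REPLAY_CMD` and the `user-login` recording are hypothetical. The `-c` replay flag is taken from the command above, and other mitmproxy versions may spell it differently.

```shell
# check_flow FILE: replay a recorded flow file and succeed if the last line
# of output mentions 200, mirroring the one-line check above.
# REPLAY_CMD is a hypothetical override so the runner can be tried
# without mitmdump installed.
check_flow() {
  "${REPLAY_CMD:-mitmdump}" -c "$1" | tail -n1 | grep -q 200
}

# run_suite FILE...: check every recording and print OK or FAIL for each;
# returns non-zero if any recording failed.
run_suite() {
  failed=0
  for flow in "$@"; do
    if check_flow "$flow"; then
      echo "OK   $flow"
    else
      echo "FAIL $flow"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   run_suite user-signup user-login || echo "something is broken"
```

Because the runner returns a non-zero status when any recording fails, it can be dropped into cron or a deploy script that already reacts to exit codes.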
© Ben Dowling for Smashing Magazine, 2012.