Writing Your Own Dev Tools: What’s Keeping My Ports Open When I Need Them?

Introduction

Let's get straight to the point. Every so often, mostly when I'm rapidly crashing and rerunning my scripts, I can't bring anything up on the port I want. Usually the previous run crashed and is still holding the port without responding. Of course, there are many ways to get the PID of that process and just kill -9 it, but I always forget whether it's netstat -nlp, lsof -n -i, ss -tanp, fuser -v -n... You can already see where my problem lies: I don't remember these magic flags, or which of these tools is available on my current machine and OS. On top of that, things change over time, and a tool that worked for me a few years ago may simply stop working for whatever reason. I don't really want to dig deeper; I just want an answer to my simple question: what's occupying the port I want to use?

Simplifying Port Listing

I looked for a suitable library in the programming languages I know (admittedly not many) and at first came up empty. I didn't want to resort to a bash script, as I'm not a fan of writing anything longer than a simple one-liner in bash, and I couldn't find an npm package that met my needs either.

However, one day I discovered the brilliant psutil, a Python library that turned out to be the solution to all my current and future problems. It met all my criteria: it's easy to use, detailed in what it reports, and cross-platform. Let's dive into it!

For installation I use venv and pip. It's quite straightforward, so I'll skip that part and assume everything necessary is installed and ready to go. The first problem I want to solve is listing all the open ports on my machine. Let's do it.

import psutil

connections = []
for c in psutil.net_connections():
    d = c._asdict()
    connections.append(d)

print(connections)

Not bad. I use _asdict to create a dict from the named tuple. Despite the leading underscore, it is a documented namedtuple method, so I'll leave it as is. The script works and gives me the answer, but the output isn't very readable. We can do better. Let's introduce pandas. There may well be a simpler solution, but I know pandas is great at formatting and parsing tables, so I'll stick with it and not bother looking for alternatives.

import pandas as pd

# one row per connection, one column per field of the named tuple
df = pd.DataFrame(connections)

print(df)

Nice! But let's not stop there. We can do much more by adding hostname resolution. Thankfully, we don't need any extra libraries to achieve this in Python. We can use socket.getnameinfo, which works like a charm and is available out of the box.

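# part of the loop body; needs "import socket" and "import numpy as np"
d = c._asdict()

# laddr is an (ip, port) pair: split it, then swap the IP for its hostname
d['laddr'], d['lport'] = d['laddr']
lname = socket.getnameinfo((d['laddr'], d['lport']), 0)
d['laddr'], _ = lname

try:
    d['raddr'], d['rport'] = d['raddr']
    rname = socket.getnameinfo((d['raddr'], d['rport']), 0)
    # getnameinfo returns a (host, service name) pair
    d['rhost'], d['proto'] = rname
except (ValueError, OSError):
    # sometimes there's no raddr (it is an empty tuple), but that's not an issue
    d['raddr'] = np.nan
    d['rport'] = np.nan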
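To see what getnameinfo hands back in isolation, here is a minimal sketch. The helper name resolve and its numeric switch are my own illustration, not part of the script; with the numeric flags set, no lookup happens at all, which makes the result predictable.

```python
import socket


def resolve(ip, port, numeric=False):
    """Return a (host, service) pair for an address (hypothetical helper)."""
    # NI_NUMERICHOST / NI_NUMERICSERV skip the name lookups entirely
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV if numeric else 0
    try:
        return socket.getnameinfo((ip, port), flags)
    except OSError:
        # socket.gaierror is an OSError; fall back to the raw values
        return ip, str(port)


host, service = resolve("127.0.0.1", 80, numeric=True)
print(host, service)
```

Without numeric=True, the pair depends on the machine's resolver and services database, so the same address can print differently on different systems.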

That looks good to me. However, psutil provides much more functionality than what I need. For my purposes, I only require the process name, as shown below.

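# pid can be None when the owner is hidden from us; Process(None) would mean this script itself
pid = d['pid']
d['pname'] = psutil.Process(pid).name() if pid is not None else np.nan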
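With the connection list and the process name in hand, the original question can already be answered for a single port. The following is only a sketch under a few assumptions: the function name who_listens is hypothetical, it looks at TCP sockets in the LISTEN state only (UDP sockets have no such state), and on some systems net_connections needs elevated privileges.

```python
import psutil


def who_listens(port):
    """Return (pid, process name) pairs for sockets listening on `port`."""
    owners = []
    for c in psutil.net_connections(kind="inet"):
        # keep only listening sockets bound to the requested local port
        if c.status != psutil.CONN_LISTEN or not c.laddr or c.laddr.port != port:
            continue
        name = None
        if c.pid is not None:
            try:
                name = psutil.Process(c.pid).name()
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                # the process exited meanwhile, or we may not inspect it
                pass
        owners.append((c.pid, name))
    return owners


if __name__ == "__main__":
    for pid, name in who_listens(8000):
        print(pid, name)
```

An empty result means nothing visible to the current user is listening on that port; a pid of None means the socket exists but its owner could not be determined.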

Query and Filter

Thanks to pandas, I don't need grep, cut, and tr. I don't think these tools are bad; I just use them so rarely that I forget which one to use when and what the arguments are, so I always start from --help and make many mistakes along the way. Instead, we can use pandas' query to select rows, loc to pick columns, and sort_values to order the result.


import argparse
import pandas as pd

parser = argparse.ArgumentParser()

parser.add_argument('-q', '--query')
parser.add_argument('-f', '--filter')
parser.add_argument('-s', '--sort')

args = parser.parse_args()

connections = []
...
df = pd.DataFrame(connections)

if args.query:
    df = df.query(args.query)

if args.filter:
    df = df.loc[:, args.filter.split(',')]

if args.sort:
    df = df.sort_values(args.sort.split(','))

print(...)
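To make the three options concrete without touching the network, here is a small sketch that applies the same steps to a hand-made table. The rows are invented sample data, and the helper name select is hypothetical; only query, loc, and sort_values come from the script above.

```python
import pandas as pd

# made-up rows standing in for the real connection list
rows = [
    {"pid": 202, "pname": "python", "lport": 8000, "status": "LISTEN"},
    {"pid": 101, "pname": "node", "lport": 3000, "status": "LISTEN"},
    {"pid": 303, "pname": "firefox", "lport": 51234, "status": "ESTABLISHED"},
]
df = pd.DataFrame(rows)


def select(df, query=None, columns=None, sort=None):
    """Apply the -q, -f and -s options in the same order as the script."""
    if query:
        df = df.query(query)                 # row filter, e.g. "lport == 8000"
    if columns:
        df = df.loc[:, columns.split(",")]   # keep only the named columns
    if sort:
        df = df.sort_values(sort.split(","))
    return df


listening = select(df, query="status == 'LISTEN'", columns="pname,lport", sort="lport")
print(listening.to_string(index=False))
```

Because the column filter runs before the sort, the sort column has to be among the columns that were kept; sorting by a dropped column raises a KeyError.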

Final Thoughts

The solution is pretty simple, yet still quite effective. There is one more thing I would like to mention: the output of print(df) may not be what you want to see, because pandas truncates long or wide tables by default. It is therefore worth exploring the arguments that to_string accepts. Below, you can find the complete script that I use. I hope you find it helpful. Happy coding!

import socket
import argparse

import psutil
import pandas as pd
import numpy as np

parser = argparse.ArgumentParser()

parser.add_argument('-q', '--query')
parser.add_argument('-f', '--filter')
parser.add_argument('-s', '--sort')

args = parser.parse_args()

connections = []
for c in psutil.net_connections():
    try:
        d = c._asdict()

        d['laddr'], d['lport'] = d['laddr']
        lname = socket.getnameinfo((d['laddr'], d['lport']), 0)
        d['laddr'], _ = lname

        try:
            d['raddr'], d['rport'] = d['raddr']
            rname = socket.getnameinfo((d['raddr'], d['rport']), 0)
            d['rhost'], d['proto'] = rname
        except (ValueError, OSError):
            d['raddr'] = np.nan
            d['rport'] = np.nan

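        # pid can be None when the owner is hidden; Process(None) would mean this script
        pid = d['pid']
        d['pname'] = psutil.Process(pid).name() if pid is not None else np.nan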

        connections.append(d)
    except Exception:
        # skip entries we cannot inspect (e.g. the process exited or access is denied)
        continue

df = pd.DataFrame(connections)

if args.query:
    df = df.query(args.query)

if args.filter:
    df = df.loc[:, args.filter.split(',')]

if args.sort:
    df = df.sort_values(args.sort.split(','))

print(
    df.to_string(
        index=False,
        max_rows=None,
        max_cols=None,
        na_rep='-',
        max_colwidth=None,
        float_format=lambda x: "{:.0f}".format(x)
    )
)
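To see what two of those to_string arguments do, here is a minimal sketch on invented sample data. A column that contains NaN is stored as floats, which is why the script formats floats without decimals and replaces the missing values with a dash.

```python
import numpy as np
import pandas as pd

# made-up rows; the missing rport turns the whole column into floats
df = pd.DataFrame(
    {
        "pname": ["node", "python"],
        "lport": [3000, 8000],
        "rport": [np.nan, 51234.0],
    }
)

text = df.to_string(
    index=False,                   # drop the row numbers
    na_rep="-",                    # show missing values as a dash
    float_format="{:.0f}".format,  # print 51234.0 as 51234
)
print(text)
```

Passing the bound method "{:.0f}".format is equivalent to the lambda used in the full script.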