Getting the Files Being Used by a Process on Mac OS X

Fans of the Mac OS X Activity Monitor might have noticed that it allows you to inspect a process and see which files it has open. Being able to get this information programmatically is useful because it lets you do things like:

  • Provide better error messages: instead of saying “File (x) is in use” you can say “File (x) is in use by (some program)”.
  • Infer things about what a program is doing: if TextEdit has somedocument.rtf open, then you know the user is probably
    editing somedocument.rtf without having to resort to using the accessibility API.
  • Detect suspicious behavior: if a process has a bunch of system files open or a bunch of sockets open, that might indicate
    it’s up to no good.

This post will walk you through writing get_process_handles, which implements a small subset of the functionality in lsof. You can fork/clone the code at https://github.com/palominolabs/get_process_handles and follow along.

The first step is to make the skeleton app. It does nothing but display the PID requested by the user on the command line and print an error message if the user didn’t supply a valid PID.
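
As a rough sketch of the argument handling such a skeleton needs, the helper below validates a PID string. The name parse_pid is hypothetical and not taken from the repository, and treating negative numbers and trailing junk as invalid is an assumption about what "valid" should mean here.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: parse a PID from a command-line argument.
 * Returns 0 and stores the PID in *out on success, -1 on invalid input. */
int parse_pid(const char *arg, pid_t *out) {
    if (arg == NULL || *arg == '\0') {
        return -1;
    }
    char *end = NULL;
    errno = 0;
    long value = strtol(arg, &end, 10);
    /* Reject overflow, trailing non-digit characters, and negative values. */
    if (errno != 0 || *end != '\0' || value < 0 || value > INT_MAX) {
        return -1;
    }
    *out = (pid_t)value;
    return 0;
}
```

In main, a failed parse would print a usage message to stderr and exit with a non-zero status; a successful one would print the PID.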

Once that’s working we come to the tasty meat of the problem. The key function we’ll use is proc_pidinfo, declared in <libproc.h>. Its prototype is

int proc_pidinfo(int pid, int flavor, uint64_t arg, void *buffer, int buffersize);


To use it we need to call it twice. The first time we pass NULL for buffer and 0 for buffersize, and it returns the number of bytes needed to hold all the information records about the given PID. We use that number to allocate a buffer of the appropriate size. Buffer in hand, we call proc_pidinfo again, this time providing our buffer. It populates the buffer with a series of proc_fdinfo structs and returns the number of bytes it wrote. The code looks something like this:

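int bufferSize = proc_pidinfo(pid, PROC_PIDLISTFDS, 0, NULL, 0);
struct proc_fdinfo *procFDInfo = (struct proc_fdinfo *)malloc(bufferSize);
// Second call fills the buffer; keep the byte count it returns, which may be smaller than the size we asked for
bufferSize = proc_pidinfo(pid, PROC_PIDLISTFDS, 0, procFDInfo, bufferSize);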
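
For a version with error handling, here is a minimal sketch of the two-call pattern wrapped in a function. The name count_open_fds is hypothetical, and treating a return value of zero or less from proc_pidinfo as failure is an assumption. The #ifdef is only there so the file still compiles on platforms without libproc.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>          /* getpid, used in the usage note */
#ifdef __APPLE__
#include <libproc.h>         /* proc_pidinfo */
#include <sys/proc_info.h>   /* struct proc_fdinfo, PROC_PIDLISTFDS */
#endif

/* Hypothetical helper: returns how many descriptors pid has open,
 * or -1 on failure (a result of zero is treated as failure too). */
int count_open_fds(pid_t pid) {
#ifdef __APPLE__
    /* First call: no buffer, so the return value is the size to allocate. */
    int needed = proc_pidinfo(pid, PROC_PIDLISTFDS, 0, NULL, 0);
    if (needed <= 0) {
        return -1;
    }
    struct proc_fdinfo *fds = malloc((size_t)needed);
    if (fds == NULL) {
        return -1;
    }
    /* Second call: fills the buffer and returns the bytes actually written. */
    int written = proc_pidinfo(pid, PROC_PIDLISTFDS, 0, fds, needed);
    free(fds);
    if (written <= 0) {
        return -1;
    }
    /* Count entries from the bytes written, not from the allocated size. */
    return written / (int)PROC_PIDLISTFD_SIZE;
#else
    (void)pid;
    return -1; /* libproc only exists on macOS */
#endif
}
```

Calling count_open_fds(getpid()) is a quick way to try it without needing permission to inspect another user's process.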

Then, since we know the size of the buffer and the size of each entry, we can iterate over each entry in the buffer:

int numberOfProcFDs = bufferSize / PROC_PIDLISTFD_SIZE;

int i;
for(i = 0; i < numberOfProcFDs; i++) {
    // Do stuff with procFDInfo[i]
}

This is oodles of fun already, but we really want to display some useful information about each handle. Each proc_fdinfo has a field indicating the type of FD (socket, file, etc.) and the actual file descriptor. If we pass the descriptor to the proc_pidfdinfo function, along with a flavor that matches its type, it can give us details about the FD such as the socket endpoints or file path. Here’s how to get the path for a file handle:

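// PROX_ (not PROC_) is the spelling used in <sys/proc_info.h>
if(procFDInfo[i].proc_fdtype == PROX_FDTYPE_VNODE) {
    struct vnode_fdinfowithpath vnodeInfo;
    // proc_pidfdinfo returns the number of bytes written; anything short of a full struct means the lookup failed
    int bytesUsed = proc_pidfdinfo(pid, procFDInfo[i].proc_fd, PROC_PIDFDVNODEPATHINFO, &vnodeInfo, PROC_PIDFDVNODEPATHINFO_SIZE);
    if(bytesUsed == PROC_PIDFDVNODEPATHINFO_SIZE) {
        // File path is in vnodeInfo.pvip.vip_path
    }
}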

The code can be easily extended to display the open TCP sockets, too. It goes like so:

if(procFDInfo[i].proc_fdtype == PROX_FDTYPE_SOCKET) {
    struct socket_fdinfo socketInfo;
    proc_pidfdinfo(pid, procFDInfo[i].proc_fd, PROC_PIDFDSOCKETINFO, &socketInfo, PROC_PIDFDSOCKETINFO_SIZE);
    // AF_INET matches IPv4 sockets only
    if(socketInfo.psi.soi_family == AF_INET && socketInfo.psi.soi_kind == SOCKINFO_TCP) {
        // Ports are stored in network byte order
        int localPort = (int)ntohs(socketInfo.psi.soi_proto.pri_tcp.tcpsi_ini.insi_lport);
        int remotePort = (int)ntohs(socketInfo.psi.soi_proto.pri_tcp.tcpsi_ini.insi_fport);
        if (remotePort == 0) {
            // Listening on localPort
        } else {
            // Communicating on localPort and remotePort
        }
    }
}
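
The port handling can be pulled out into small helpers that are easy to check on their own. This is only a sketch: host_port and describe_tcp_ports are hypothetical names, the output format is made up for illustration, and the "remote port of zero means listening" rule is the same heuristic used in the snippet above.

```c
#include <arpa/inet.h>  /* ntohs */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: convert a port field, stored in network byte order
 * as in insi_lport and insi_fport, to host byte order. */
int host_port(int port_net) {
    return (int)ntohs((uint16_t)port_net);
}

/* Hypothetical helper: write a one-line description of a TCP socket into out.
 * Both port arguments are in network byte order.
 * Returns the snprintf result (the length of the full description). */
int describe_tcp_ports(int lport_net, int fport_net, char *out, size_t outlen) {
    int local = host_port(lport_net);
    int remote = host_port(fport_net);
    if (remote == 0) {
        /* Heuristic: no remote port, so assume a listening socket. */
        return snprintf(out, outlen, "listening on port %d", local);
    }
    return snprintf(out, outlen, "port %d <-> port %d", local, remote);
}
```

Inside the loop, the two arguments would come from tcpsi_ini.insi_lport and tcpsi_ini.insi_fport.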

At this point our little utility can display open files and TCP sockets. You can clone/fork to your heart’s content at https://github.com/palominolabs/get_process_handles/. As a fun exercise you can extend it to support UDP sockets, open AppleTalk ports, or installed FS event listeners by handling different values of proc_fdtype.

Posted by Manuel Wudka-Robles

Manuel is a sponge for software knowledge. Manuel’s software development expertise ranges from Rails web development to obscure 3rd party APIs and long-forgotten web properties. Manuel was the “API guy” at Genius.com, recognized for his deep knowledge of how to build and scale APIs. At Turn, Manuel focused on improving the exciting world of display advertising. Most recently at Tello, Manuel led the integration with Twilio and KISSMetrics. At Palomino Labs, Manuel also serves as Director of IE Compatibility.
