Explore the filesystem with a PHP directory list class

Not a PHP directory list - an early phone book

Sounds simple, right? Well, listing directory contents is a core task in most programming languages. But there are a few gotchas to bear in mind when creating a PHP directory list class. First of all, you need to decide whether you need to recurse – that is, follow subdirectories. If so, you must be careful to avoid infinite recursion. If you want to use a child class to extend the base functionality then you’ll need to provide a mechanism for that class to control listing behaviour. In this post, I’ll put a simple example together.

Old school or new school

When dealing with directories you can choose to use core PHP functions such as opendir and readdir or you can use the newer SPL tool DirectoryIterator. This latter approach is the cleaner way to go for several reasons:

  • DirectoryIterator internalises the steps involved in acquiring a directory resource.
  • DirectoryIterator implements Iterator – which means you can loop over the contents of a directory like an array – with foreach.
  • DirectoryIterator is object-oriented and provides easy access to common operations and tests you might want to perform on files and directories

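As a rough illustration of that last point, here is a minimal sketch that builds a throwaway directory and asks each item a few questions about itself. The directory and file names are assumptions invented for the example; the methods used (isDot(), isDir(), isFile(), getSize() and getFilename()) are standard DirectoryIterator methods.

```php
<?php
// Build a small temporary directory so the example is self-contained.
// The names "docs" and "notes.txt" are illustrative assumptions.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'lister_demo_' . uniqid();
mkdir($dir);
mkdir($dir . DIRECTORY_SEPARATOR . 'docs');
file_put_contents($dir . DIRECTORY_SEPARATOR . 'notes.txt', 'hello');

$report = [];
foreach (new \DirectoryIterator($dir) as $item) {
    if ($item->isDot()) {
        continue; // skip "." and ".."
    }
    // Each item can report its own type, size and name.
    $type = $item->isDir() ? 'dir' : 'file';
    $size = $item->isFile() ? $item->getSize() : 0;
    $report[$item->getFilename()] = "$type, $size bytes";
}

ksort($report); // listing order is not guaranteed, so sort for display
foreach ($report as $name => $info) {
    print "$name: $info\n";
}

// Tidy up the temporary files.
unlink($dir . DIRECTORY_SEPARATOR . 'notes.txt');
rmdir($dir . DIRECTORY_SEPARATOR . 'docs');
rmdir($dir);
```

With opendir and readdir, each of those questions would need a separate function call against a path you assemble yourself.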
Listing the contents of a directory with DirectoryIterator

So let’s get started. Generating a PHP directory list is as simple as instantiating a DirectoryIterator object and looping through it:

$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    print "$item\n";
}

Why the backslash in front of DirectoryIterator? Well, best practice requires that we put most of our code under a namespace. We use the backslash to ensure that the SPL DirectoryIterator class in the global namespace is instantiated and not a local class of the same name. Note also that I printed the $item variable directly even though it contained an instance of DirectoryIterator. I was able to do this because the DirectoryIterator object has a __toString() method, which returns the item’s file name when the object is used in a string context.

Compare the previous fragment with the old way of achieving the same thing:

$dir = "/home/mattz";
 
$dh = opendir($dir);
while (($item = readdir($dh)) !== false) {
    print "$item\n";
}
closedir($dh);

Much less clean and easy.
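The string-context behaviour mentioned above can be checked directly. The following is a small sketch, assuming a temporary directory containing one file (both names are made up for the example). It also hints at a related gotcha: DirectoryIterator hands back the same object on every pass of the loop, so if you want to keep names for later it is safer to store the string than the object.

```php
<?php
// Hypothetical fixture: a temporary directory holding a single file.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tostring_demo_' . uniqid();
mkdir($dir);
touch($dir . DIRECTORY_SEPARATOR . 'readme.txt');

$names = [];
$same = false;
foreach (new \DirectoryIterator($dir) as $item) {
    if ($item->isDot()) {
        continue;
    }
    // Casting to string invokes __toString(), which yields the file name.
    $names[] = (string) $item;
    $same = ((string) $item === $item->getFilename());
}

print implode(", ", $names) . "\n";

// Tidy up the temporary files.
unlink($dir . DIRECTORY_SEPARATOR . 'readme.txt');
rmdir($dir);
```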

Testing for dot directories

This is essential if we are to go recursive because “.” represents the current directory and “..” represents the parent directory. These appear in all Unix directory listings. If my PHP directory list code were to follow these directories it would never stop recursing – and the script would soon blow up. DirectoryIterator provides a simple method to check this.

$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    if ($item->isDot()) {
        print "ignoring the dot!\n";
        continue;
    }
    print "$item\n";
}

isDot() tests that the item is both a directory and one of “.” and “..”. Using the older functions, I must perform those tests for myself:

$dir = "/home/mattz";
$dh = opendir($dir);
$s = DIRECTORY_SEPARATOR;
while (($item = readdir($dh)) !== false) {
    if (is_dir("{$dir}{$s}{$item}") && ($item == "." || $item == "..")) {
        print "ignoring the dot!\n";
        continue;
    }
    print "$item\n";
}
closedir($dh);

Testing for symlinks

This is another issue that can cause problems when exploring a directory structure. If we follow symlinks to directories we can find ourselves moving unexpectedly into new parts of the filesystem, or looping back into a directory we have already visited, with unexpected or even dangerous results. So by default I’m going to turn off symlink following. As you might expect, DirectoryIterator has a handy method to detect symlinks:

$dir = "/home/mattz";
$lister = new \DirectoryIterator($dir);
foreach ($lister as $item) {
    if (
        $item->isDot() || 
        ($item->isDir() && $item->isLink())
    ) {
        print "($item) ignoring the dot or the symlink dir!\n";
        continue;
    }
    print "$item\n";
}

We only want to ignore symlinks to directories so we employ two tests here: isDir() and isLink().
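To see why both tests are needed, here is a small sketch that creates a symlink pointing back at its own parent directory, which is exactly the kind of link that would trap a recursive lister. It assumes a Unix-like system where symlink() is permitted, and the names "real" and "loop" are invented for the example.

```php
<?php
// Assumes a Unix-like system on which symlink() is allowed.
$dir = sys_get_temp_dir() . '/symlink_demo_' . uniqid();
mkdir($dir);
mkdir("$dir/real");
symlink($dir, "$dir/loop"); // "loop" points back at its own parent

$kinds = [];
foreach (new \DirectoryIterator($dir) as $item) {
    if ($item->isDot()) {
        continue;
    }
    // isDir() follows the link, so a symlinked directory reports true;
    // isLink() is what tells the two cases apart.
    $kinds[$item->getFilename()] = [
        'isDir'  => $item->isDir(),
        'isLink' => $item->isLink(),
    ];
}

foreach ($kinds as $name => $flags) {
    print $name
        . ': isDir=' . var_export($flags['isDir'], true)
        . ' isLink=' . var_export($flags['isLink'], true) . "\n";
}

// Tidy up: unlink() removes the link itself, not its target.
unlink("$dir/loop");
rmdir("$dir/real");
rmdir($dir);
```

Both entries pass the isDir() test, so without the isLink() check a recursive lister would descend into "loop" and find itself back where it started.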

Going old school, I can use the is_link function.

$dir = "/home/mattz";
$dh = opendir($dir);
$s = DIRECTORY_SEPARATOR;
 
while (($item = readdir($dh)) !== false) {
    $path = "{$dir}{$s}{$item}";
    if (is_dir($path) &&
        (
            ($item == "." || $item == "..") ||
            is_link($path)
        )
    ) {
        print "ignoring the dot or the symlink dir!\n";
        continue;
    }
    print "$item\n";
}
closedir($dh);

Putting it together – a PHP directory list class

I’m going to place my PHP directory list functionality into a class for two reasons. Firstly, a class will allow me to save state (if I want to compile a filtered list, for example). Secondly, a class will be easy to extend for future uses.

Here goes:

class Lister
{
    public function listdir(\DirectoryIterator $iterator)
    {
        foreach ($iterator as $file) {
            if ($file->isDot() || ($file->isDir() && $file->isLink())) {
                continue;
            }
 
            if ($file->isDir()) {
                if ($this->handleDir($file)) {
                    $this->listdir(new \DirectoryIterator($file->getPathName()));
                }
                continue;
            }
 
            if (! $this->handleFile($file)) {
                // no further iteration of this directory
                return;
            }
        }
    }
 
    protected function handleDir(\DirectoryIterator $it)
    {
        print "$it\n";
        return true;
    }
 
    protected function handleFile(\DirectoryIterator $it)
    {
        print "$it\n";
        return true;
    }
}

So the only new DirectoryIterator piece here is getPathName(). That returns the full path of the current file or directory. I use that to create a new DirectoryIterator object which I pass to the listdir() method all over again. In this way, when I encounter a directory that is not a symlink or one of . or .. I jump down and start the process all over again.

There is an exception to this. I only make this recursive call if a method named handleDir(), which I call first, returns true. Similarly, I call handleFile() for each file I encounter in the directory listing. If handleFile() does not return true, I abort the listing in the current directory. Since both methods are hardcoded to return true, this might seem redundant. I’ll show you how it can be made useful in the next section.

First, though, here’s the old school version of this code:

class Lister
{
    public function listdir($dir)
    {
        $dh = opendir($dir);
        $s = DIRECTORY_SEPARATOR;
 
        while (($item = readdir($dh)) !== false) {
            $path = "{$dir}{$s}{$item}";
            if (is_dir($path) &&
                (
                    ($item == "." || $item == "..") ||
                    is_link($path)
                )
            ) {
                continue;
            }
 
            if (is_dir($path)) {
                if ($this->handleDir($path)) {
                    $this->listdir($path);
                }
                // directories are never passed to handleFile()
                continue;
            }
 
            if (! $this->handleFile($path)) {
                // no further iteration of this directory;
                // break rather than return so the handle is still closed
                break;
            }
        }
        closedir($dh);
    }
 
    protected function handleDir($dir)
    {
        print "{$dir}\n";
        return true;
    }
 
    protected function handleFile($file)
    {
        print "{$file}\n";
        return true;
    }
}

As you can see, things are beginning to get clunky – and it will only get worse as you begin to work more with the files and paths. The SPL classes exist to encapsulate complexity – which usually means cleaner, more elegant code.

Using the lister: the A game

In common with most object-oriented coders, I hate duplication. Duplicated code is inelegant and it can cause problems over time. With duplications in your system you have to remember to fix bugs and add features in every place the code block is repeated. Before you know it, parts of your system fall out of alignment with others, and things spin out of control like a blaster-clipped TIE fighter. By creating a single parent class with common functionality, you define core functionality only once.

My PHP directory list class – Lister – is designed to be overridden in this way. By creating a child class and then overriding handleDir() and handleFile() you can do what you like with the directories and files the parent class traverses for you. Here, for example, is AllTheAs, a simple PHP directory list class that’s designed only to navigate and print items that begin with the letter ‘a’.

class AllTheAs extends Lister
{
    protected function handleDir(\DirectoryIterator $it)
    {
        if (strpos($it->getFileName(), "a") === 0) {
            print "$it\n";
            return true;
        }
        return false;
    }
 
    protected function handleFile(\DirectoryIterator $it)
    {
        if (strpos($it->getFileName(), "a") === 0) {
            print "$it\n";
        }
        return true;
    }
}

Because handleDir() only returns true when the provided directory begins with the letter ‘a’, only such directories will be traversed. handleFile() always returns true because I don’t want to terminate listing within a directory – but it only outputs matching file names.

Here’s the code I use to call it:

$lister = new AllTheAs();
$lister->listdir(new \DirectoryIterator("/home/mattz/Dropbox"));
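The other reason given earlier for using a class was the ability to save state. As a rough sketch of that idea, here is a second child class that gathers matching paths into an array rather than printing them. TextFileCollector, its getPaths() method and the fixture file names are all hypothetical, invented for this example; a compact copy of Lister is included only so the fragment runs on its own.

```php
<?php
// Compact copy of the Lister class so this sketch is self-contained.
class Lister
{
    public function listdir(\DirectoryIterator $iterator)
    {
        foreach ($iterator as $file) {
            if ($file->isDot() || ($file->isDir() && $file->isLink())) {
                continue;
            }
            if ($file->isDir()) {
                if ($this->handleDir($file)) {
                    $this->listdir(new \DirectoryIterator($file->getPathname()));
                }
                continue;
            }
            if (! $this->handleFile($file)) {
                return; // no further iteration of this directory
            }
        }
    }

    protected function handleDir(\DirectoryIterator $it)
    {
        return true;
    }

    protected function handleFile(\DirectoryIterator $it)
    {
        return true;
    }
}

// Hypothetical child class: collects .txt paths instead of printing them.
class TextFileCollector extends Lister
{
    private $paths = [];

    protected function handleFile(\DirectoryIterator $it)
    {
        if ($it->getExtension() === 'txt') {
            // Store the path string, not $it: the iterator object is reused.
            $this->paths[] = $it->getPathname();
        }
        return true;
    }

    public function getPaths()
    {
        sort($this->paths); // listing order is not guaranteed
        return $this->paths;
    }
}

// Throwaway fixture tree (names are illustrative assumptions).
$root = sys_get_temp_dir() . '/collector_demo_' . uniqid();
mkdir("$root/sub", 0777, true);
touch("$root/a.txt");
touch("$root/b.log");
touch("$root/sub/c.txt");

$collector = new TextFileCollector();
$collector->listdir(new \DirectoryIterator($root));
$found = $collector->getPaths();
foreach ($found as $path) {
    print "$path\n";
}

// Tidy up the temporary files.
unlink("$root/a.txt");
unlink("$root/b.log");
unlink("$root/sub/c.txt");
rmdir("$root/sub");
rmdir($root);
```

Because handleDir() is left at its default, every subdirectory is traversed; only handleFile() decides what gets remembered.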

phone book 2
Wystan CC BY 2.0
