Ayadi Tahar | Track down space in Linux

Track down space in Linux

Publish Date: 2023-06-10


Sooner or later a data drive runs out of space. In a production environment this can cause serious problems, and it often goes unnoticed until data is lost and applications stop running.

In this article we will see how to track down where space is being used on a Linux system, so that you can decide whether to add more storage or delete unwanted data to free up some space.

Where to start?

Usually the first sign is an error from an application, or a process that stops working, which gives you a hint about where to look. Without such a hint, start by checking usage per disk partition through its mount point, then narrow the search down to a single mount point.

The df command

The first thing you will usually want to do is check the free disk space with the df command:


df -h
Filesystem                      Size  Used Avail Use% Mounted on
devtmpfs                        4.9G     0  4.9G   0% /dev
tmpfs                           4.9G  168K  4.9G   1% /dev/shm
tmpfs                           4.9G  496M  4.4G  10% /run
tmpfs                           4.9G     0  4.9G   0% /sys/fs/cgroup
/dev/mapper/rhel-root           355G  146G  210G  41% /
none                            4.9G   68K  4.9G   1% /tmp
/dev/sda2                      1014M  353M  662M  35% /boot
/dev/sda1                       599M  5.8M  594M   1% /boot/efi
tmpfs                           993M  100K  993M   1% /run/user/1001

From the output above we see that the root (/) partition is 41% full, with about 146G used, so let's start our investigation there.
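When there are many mount points, picking out the full ones by eye gets tedious. The following is a minimal sketch, not part of the original procedure, that filters df output by a usage threshold. The function name over_threshold and the default of 80% are assumptions made for illustration, and mount points containing spaces are not handled.

```shell
# over_threshold [LIMIT]: read df output on stdin and print the mount
# points whose Use% is at or above LIMIT (default 80, an arbitrary choice).
over_threshold() {
    local limit=${1:-80}
    awk -v limit="$limit" 'NR > 1 {      # skip the header line
        use = $5                         # fifth column is Use%
        sub(/%/, "", use)                # strip the percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Usage (df -P keeps each filesystem on a single line):
#   df -P | over_threshold 40
```

Reading from standard input keeps the filter independent of how df is invoked, so it also works on saved output.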

The du Command

The du command is short for "disk usage". It reports where storage is being used, and it takes the directory to start from as an argument.


du /
0       /proc/2389695/task/2389719/attr
0       /proc/2389695/task/2389719
16      /etc/smartmontools
0       /etc/qemu-ga/fsfreeze-hook.d
    <<  omitted output >>
4       /etc/qemu-ga
8       /etc/nvme
35596   /etc
4       /root/.cache/dconf

As we can see, without options the output of du is too long to be useful. Instead, we can report sizes in GB, sort the output by size (the largest entries end up last), and show only the five largest:


du -BG --max-depth=1 / | sort -n | tail -n 5
8G      /kni
35G     /home
96G     /var
193G    /mnt
338G    /
  1. -B sets the block size unit (G for gigabytes in our case, M for megabytes)
  2. --max-depth=1 shows totals for the first level of subdirectories only
  3. sort orders the results, and its -n option compares the leading numbers numerically
  4. tail with -n 5 keeps only the last five lines, which are the largest entries
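Since this pipeline gets retyped for every directory, it can be wrapped in a small function. This is a sketch that assumes GNU du. The name top_dirs and the optional unit argument are made up for illustration, and the -x option, which keeps du on a single filesystem so that other mounted filesystems are not counted, is an addition that the command above does not use.

```shell
# top_dirs DIR [COUNT] [UNIT]: print the COUNT largest entries one level
# below DIR, smallest first. UNIT is passed to -B (G, M or K).
top_dirs() {
    local dir=${1:-/} count=${2:-5} unit=${3:-G}
    # -x is an assumption added here: it skips directories that live on
    # other filesystems. Errors for unreadable directories are discarded.
    du -x -B"$unit" --max-depth=1 "$dir" 2>/dev/null | sort -n | tail -n "$count"
}

# Usage:
#   top_dirs / 5
#   top_dirs /var 10 M
```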

From the result above, /mnt seems to hold a large amount of data, but since it is a mount point for external shared files, it is not what fills the root partition. The second-largest directory is /var, so let's take a deeper look into it, going one level further with each du run:


du -BG --max-depth=1 /var | sort -n | tail -n 5
1G      /var/spool
1G      /var/tmp
1G      /var/www
95G     /var/lib
96G     /var

Then increase the depth by one more step:


du -BG --max-depth=2 /var | sort -n | tail -n 5
1G      /var/www
1G      /var/www/html
95G     /var/lib
95G     /var/lib/containers
96G     /var

Instead of increasing the depth each time, we can pass the path of the specific directory, which gives the same result:


du -BG --max-depth=1 /var/lib/containers | sort -n | tail -n 5
0G      /var/lib/containers/sigstore
1G      /var/lib/containers/cache
95G     /var/lib/containers
95G     /var/lib/containers/storage

We may narrow down our search even further by combining depth with the target directory.


du -BG --max-depth=7 /var/lib/containers | sort -n | tail -n 5
81G     /var/lib/containers/storage/volumes/quay-storage
81G     /var/lib/containers/storage/volumes/quay-storage/_data
84G     /var/lib/containers/storage/volumes
95G     /var/lib/containers
95G     /var/lib/containers/storage

du -BG --max-depth=9 /var/lib/containers/storage | sort -n | tail -n 5
80G     /var/lib/containers/storage/volumes/quay-storage/_data/sha256
81G     /var/lib/containers/storage/volumes/quay-storage
81G     /var/lib/containers/storage/volumes/quay-storage/_data
84G     /var/lib/containers/storage/volumes
95G     /var/lib/containers/storage

It appears that we have found the main consumer of our storage space: the sha256 directory holds 80G, and du reports no further large directory below it. Going one level deeper shows that this space is spread over many small subdirectories of similar size:


du -BG --max-depth=9 /var/lib/containers/storage/volumes/quay-storage/_data/sha256 | sort -n | tail -n 5
2G      /var/lib/containers/storage/volumes/quay-storage/_data/sha256/f4
2G      /var/lib/containers/storage/volumes/quay-storage/_data/sha256/f5
3G      /var/lib/containers/storage/volumes/quay-storage/_data/sha256/6c
3G      /var/lib/containers/storage/volumes/quay-storage/_data/sha256/c8
80G     /var/lib/containers/storage/volumes/quay-storage/_data/sha256

du -BG --max-depth=9 /var/lib/containers/storage/volumes/quay-storage/_data/sha256/f4 | sort -n | tail -n 5
2G      /var/lib/containers/storage/volumes/quay-storage/_data/sha256/f4

When we list that folder, we see the subdirectories that hold the container data used by the Quay repository:


ls -l /var/lib/containers/storage/volumes/quay-storage/_data/sha256
total 1020
drwxr-xr-x. 2 kni root 4096 Nov  1 00:19 00
drwxr-xr-x. 2 kni root 4096 Nov  1 00:39 01
drwxr-xr-x. 2 kni root 4096 Nov  1 00:36 02
drwxr-xr-x. 2 kni root 4096 Nov  1 00:22 03
drwxr-xr-x. 2 kni root 4096 Oct 31 23:44 04
    [ omitted output ]
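The manual drill-down used in this section (run du, pick the largest subdirectory, repeat) can be sketched as a loop. This is only an illustration under the assumption of GNU du. The function name drill_down is hypothetical, and the -x option is an addition that keeps the walk on one filesystem.

```shell
# drill_down DIR: starting at DIR, repeatedly print the largest immediate
# subdirectory (size in KiB, then path) and descend into it, until a
# directory without subdirectories is reached.
drill_down() {
    local dir=${1:-/} next
    while :; do
        # du prints DIR's own total as well, so that line is filtered out
        # by path before the largest remaining entry is taken.
        next=$(du -x -k --max-depth=1 "$dir" 2>/dev/null \
            | sort -n \
            | D="$dir" awk -F'\t' '$2 != ENVIRON["D"]' \
            | tail -n 1)
        [ -n "$next" ] || break
        printf '%s\n' "$next"
        dir=$(printf '%s\n' "$next" | cut -f2-)
    done
}

# Usage:
#   drill_down /var
```

The loop follows only the single largest branch at each level, so it can miss a case where the space is split evenly between several siblings. The depth-by-depth du listings remain the better tool for that.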

The find Command

Another handy command is find, which is used to search for files in Linux.

Here is a basic example that looks for a file named file1.txt in the current directory (.):


    find . -name file1.txt

To search in a specific directory, replace the dot (.) with its path. In this example we look for all files ending in .jpg in /home and the directories below it. The pattern is quoted so that the shell passes it to find unchanged instead of expanding it first:


  find /home -name '*.jpg'
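A name search becomes more useful for tracking space when the matching files are also measured. The sketch below, which assumes GNU find (for -printf) and uses the made-up function name total_size, adds up the sizes of all files matching a pattern.

```shell
# total_size DIR PATTERN: print the combined size in bytes of all regular
# files under DIR whose name matches PATTERN.
total_size() {
    local dir=$1 pattern=$2
    # %s prints each file's size in bytes, one per line; awk sums them.
    find "$dir" -type f -name "$pattern" -printf '%s\n' 2>/dev/null \
        | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Usage (quote the pattern so the shell does not expand it):
#   total_size /home '*.jpg'
```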

We can also use find to locate big files that are candidates for removal as a quick fix, for example files larger than 100 MB.

We accomplish this with the -size and -printf options. In the following example, -printf displays the size in bytes followed by the file path. As before, we sort the result by file size with sort -n:


find /var -size +100M -printf '%s %p\n' | sort -n
2725803587 /var/lib/libvirt/qemu/save/esxi03.save
12225464589 /var/lib/libvirt/qemu/save/esxi01.save
21478375424 /var/lib/libvirt/images/esxi01.qcow2
21478375424 /var/lib/libvirt/images/esxi03-2.qcow2
21478375424 /var/lib/libvirt/images/esxi03.qcow2
26847870976 /var/lib/libvirt/images/esxi02.qcow2
107390828544 /var/lib/libvirt/images/esxi01-1.qcow2
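Sizes in bytes are hard to read at this scale. As a sketch, assuming GNU find and the numfmt tool from GNU coreutils are available, the same search can be wrapped in a function that prints human-readable sizes. The name large_files and the added -xdev and -type f options are choices made for this illustration, not part of the command above.

```shell
# large_files DIR [MIN]: list regular files under DIR larger than MIN
# (a find -size argument, default +100M), smallest first, with the size
# column converted to human-readable units.
large_files() {
    local dir=${1:-/var} min=${2:-+100M}
    # -xdev (an assumption added here) keeps find on one filesystem.
    # Sorting happens on the raw byte counts, before numfmt rewrites them.
    find "$dir" -xdev -type f -size "$min" -printf '%s %p\n' 2>/dev/null \
        | sort -n \
        | numfmt --to=iec --field=1
}

# Usage:
#   large_files /var +100M
```

The function only lists files. Reviewing the list before removing anything is the safer order, since images and save files like those above may still be in use.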

The lsof command

As the saying goes, "everything in Linux is a file." One useful tool for managing disk space is the lsof (list open files) command, which displays all the open files that processes are using.

Normally the space a file occupies is released when the file is deleted. This is not the case while a process still has the file open, for example when a user deletes an application's log file while the application is running. The file's data is not actually removed and the application keeps writing to it, yet ls and file managers no longer show the file.

The lsof command helps to find such files, as it lists process file descriptors, including those that point to deleted files. An actual command might be:


lsof | grep -E '^COM|deleted'
COMMAND   PID   TID   USER   FD    TYPE       DEVICE    SIZE/OFF     NODE NAME
Isolated   9742 10044 RemoteLzy            ahmed   54r      REG                0,1     20820       4196 /memfd:mozilla-ipc (deleted)
unattende  1149  1236 gmain                 root    3w      REG              259,5       113    8783429 /var/log/unattended-upgrades/unattended-upgrades-shutdown.log.1 (deleted)
java      10224 19174 DefaultDi            ahmed  118r      REG              259,5     32768   10514221 /home/ahmed/.local/share/gvfs-metadata/root-c62e7316.log (deleted)

As we see in the output above, some running processes still hold deleted files that consume space. To release it, we either close those files or restart the programs that hold them: in our case Mozilla, the unattended-upgrades shutdown service (which calls for a system restart), and a Java application (a text editor).
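On systems where lsof is not installed, similar information can be read from /proc. The following is a Linux-only sketch, with the made-up function name list_deleted_open_files. It relies on the kernel appending " (deleted)" to the target of a file descriptor link once the file has been unlinked. Without root privileges it only sees the current user's processes. Where lsof is available, its +L1 option, which selects open files with a link count below one, is another way to get this list.

```shell
# list_deleted_open_files: print every file descriptor under /proc whose
# target has been deleted while a process still holds it open.
list_deleted_open_files() {
    local fd target
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors that vanished or that we are not
        # allowed to inspect; those are skipped quietly.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}

# Usage:
#   list_deleted_open_files
```

The process ID is the number after /proc/ in each printed path, which identifies the program to close or restart.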

Conclusion

In this article we saw several approaches to examining the filesystem in Linux to find out where disk space is used. How quickly and precisely that space can be located depends on the tools installed on the system.