What kind of maintenance does one need to do on an Ubuntu / Debian / Raspberry Pi Linux distributions? Defrag the drive, clean your registry, update antivirus, etc. just like you need to in Windows? None of this is needed in Linux but there is some recommend filesystem hygiene you should be doing.
Gather Information
For starters, lets gather some information. Some of the maintenance activities listed here, you're going to need some basic information. This next section shows you how to get that information.
Distribution Name and Version
To determine which Linux version / build / release / distribution you are running:
# kernel name and release uname -sr # kernel version uname -v # Linux distribution cat /etc/*-release lsb_release -a
Use one of the following command to get a listing of all the physical hard drives (and virutal drives) on the system:
# terminal commandline utilities to list disk drives hwinfo --disk --short sudo lshw -class disk -class storage -short # GTK+ graphical user interface version of lshw sudo lshw-gtk &
For the Raspberry Pi, you can list the installed firmware version via:
# list the Raspberry Pi's firmware version
/opt/vc/bin/vcgencmd version
To get a broader, and potentially more detailed look at your system,
consider the commands lshw
and the graphical tools lshw-gtk
and sysinfo
.
OS and Application Maintenance
You should periodically update your Linux operating system (OS) and its applications.
Install Operating System and Application Patches/Updates
This will patch the Linux operating system and all its GPL applications
# commandline utility for applications upgrade sudo apt-get update; sudo apt-get dist-upgrade # graphics utility for applications upgrade update-manager -c
Updating Firmware for Raspberry Pi
In the case of the Raspberry Pi (RPi), you will want to also upgrade the firmware regularly.
Raspbian is the standard Linux operating system distribution for the RPi,
but it doesn't include firmware.
Never the less, tools for updating the firmware are included in the Raspbian distribution of Linux.
That tool is sudo apt-get install rpi-update
.
I'm using the Adafruit's Occidentalis distribution, so this requires a
slightly different update tool (git
needs to be installed):
# install tools to upgrade Raspberry Pi's firmware
sudo wget https://raw.github.com/Hexxeh/rpi-update/master/rpi-update -O /usr/bin/rpi-update
sudo chmod +x /usr/bin/rpi-update
Once the tool has been installed, periodically you can update the firmware via these commands:
# check for and install any required Raspberry Pi firmware upgrades sudo BRANCH=next rpi-update sudo reboot
Note that once the firmware has been successfully updated, you'll need to reboot to load the new firmware.
Filesystem Maintenance
Filesystems and disks should be check to make sure they are not running low on resources and are not showing any signs of pending failure.
Check Storage and Inode Usage
If you let some directories get really full, like above 95% full, you will see some serious system problems. Check on the status of directory systems storage space and inode usage:
df df -i
Disk and Filesystems Integrity
Smartmontools is a set of applications that can test hard drives
and read their hardware SMART statistics (install with sudo apt-get install smartmontools
).
To ensure that your drive supports SMART, type the following for each physical drive:
sudo smartctl -i /dev/sda
If smartctl
can access the drive, you should turn on some SMART features.
I ran the following command on my three drives (example for the /dev/sda
drive):
sudo smartctl -s on -o on -S on /dev/sda
Check the disk's overall health:
sudo smartctl -H /dev/sda
If it doesn’t return PASSED, you should immediately backup all your data. A short, but more extensive test is
sudo smartctl -t short /dev/sda
You can do a long self-test, but this can take a significant amount of time. You might want to run it overnight and check for the results in the morning.
sudo smartctl -t long /dev/sda
To check results, run the following:
sudo smartctl -l selftest /dev/sda
Unfortunately, there’s no way to check progress, so just keep running that command until the results show up. If either test fails, you should immediately backup all your data. Depending on the error, your drive might be close to death or it may still have a long life ahead. Consult the smartmontools FAQ.
We’ve now enabled some features and run the basic tests.
Instead of repeating the previous section daily, we can setup smartd
to do it all automatically.
Via the demon, you can run Smartmontools in the background,
have it check drives, and email you when there are issues.
See the Sources below to figure out what needs to be done to setup the smartd
demon.
Filesystem Checks and Repair
The Linux filesystem can be damaged under various circumstances, e.g., system crash,
power loss, disconnected disk, accidentally overwritten i-node, etc.
Thus it is a good idea to check the integrity of the filesystem regularly
to minimize the risk of filesystem corruption.
The tool used to do filesystem checks and repairs is fsck
.
You can check out the article to understand
"How to force fsck to check filesystem after system reboot on Linux"
Typically, you should run fsck
periodically,
but there are scenarios when you will immediately want to run fsck
.
Typical examples are:
- The system fails to boot
- Files on the system become corrupt (often you may see input/output error)
- Attached drive (including flash drives/SD cards) is not working as expected
NOTE: A filesystem check can run for many minutes, if not hours, depending on the size of the filesystem. You'll want to schedule the check for an appropriate time.
Using the utility tune2fs
we can get information
related to filesystem health and status.
The example below for the filesystem /dev/sda
,
you get the:
- number of times the filesystem has been mounted since the last
fsck
checkup, - number of mounts before a filesystem check is forced to happen,
- and the date/time of the last
fsck
check:
# what is the status of the /dev/sda filesystem $ sudo tune2fs -l /dev/md0 | grep -e "Last checked" -e Max -e Mount Mount count: 490 Maximum mount count: -1 Last checked: Mon May 27 14:57:13 2013
Generally, your going to want fsck
to automatically attempt to correct errors it finds.
This can be done with sudo umount /dev/sdb && sudo fsck -y /dev/sdb
.
The -y
flag, automatically “yes” to any prompts from fsck
to correct an error.
Forced Root Filesystem Check
You can do force a one-time filesystem check on the root file system
(aka /
) on the next reboot by doing the following:
sudo touch /forcefsck
Once you create an empty file named forcefsck
in the root directory,
it will force filesystem check the next time you boot up (only on the root filesystem).
After successful booting, /forcefsck
will automatically be removed.
Forced Non-Root Filesystem Check
Unlike the root filesystem,
there is no equivalent to /forcefsck
file for non-root filesystems.
The only way to force fsck
on all other non-root partitions
is to manipulate the filesystem's "Maximum mount count" parameter
within the /etc/fstab
configuration file.
To force filesystem fsck
check on non-root partition's to every 10th mount:
# force a filesystem check with fsck every 10 mount sudo tune2fs -c 10 /dev/sdb # to disable this automatic check sudo tune2fs -c 0 /dev/sdb # or sudo tune2fs -c -1 /dev/sdb
Periodic Filesystem Hygiene
Linux will leave some clutter around in the filesystem.
While generally not a problem, it can eat-up disk space,
and can become a problem for the /boot
directory.
Clean-Up Temporary Files
Some editors (like vim) may leave files ending with a ‘~' character laying around.
You can clean them up under your $HOME
as a normal user
(You can do it for the entire system as root, but that can be extremely dangerous.)
Use the command below to get a list of candidates:
find $HOME -type f -name "*~" -print
After that appears to do what you want, add the -exec part.
find $HOME -type f -name "*~" -print -exec rm {} \;
Kernel crashes, when they happen, write the core dump files under /var
.
Assuming you aren't saving them for debugging, you can do this to get a listing:
sudo find /var -type f -name "core" -print
Some applications create temporary files in their own directories:
rm -rf ${HOME}/.macromedia/* ${HOME}/.adobe/*
Clean-Up Old Log Files
You can also remove old compress log files from the system with
sudo rm -v /var/log/*.gz
Clean-Up Installation Packages
To remove partial packages, clean the cache, remove unused dependencies use:
# For software packages remove partial packages, clean the cache, remove unused dependencies
sudo apt-get autoclean
sudo apt-get clean
sudo apt-get autoremove
Clean-Up Old Kernel Packages
You also need to do something similar for kernel installations.
You'll find the amount of space being used by the current kernel
and old kernel installation packages in /boot
by running:
df -h /boot
The current Linux kernel installation (and the one you most definitely must keep) can be identified via:
uname -r
Run the following command to list all packages that you no longer need:
# list all the kernel packages installed on the system dpkg -l 'linux-*' # print the currently active kernel version uname -r | sed "s/\(.*\)-\([^0-9]\+\)/\1/" # list all the currently loaded old kernel packages, that is other than the active one dpkg -l 'linux-*' | sed '/^ii/!d;/'"$(uname -r | sed "s/\(.*\)-\([^0-9]\+\)/\1/")"'/d;s/^[^ ]* [^ ]* \([^ ]*\).*/\1/;/[0-9]/!d'
You can use the above command to permanently delete ALL older kernels:
# USE WITH CAUTION: perminately delete old kernel packages sudo apt-get remove --purge $(dpkg -l 'linux-*' | sed '/^ii/!d;/'"$(uname -r | sed "s/\(.*\)-\([^0-9]\+\)/\1/")"'/d;s/^[^ ]* [^ ]* \([^ ]*\).*/\1/;/[0-9]/!d')
However this may not be wise, as you should always have an old kernel or two to fall back to (just in case the new one doesn't work with your system). At the very least, if you've just upgraded the kernel, reboot before deleting the older versions.
# USE WITH CAUTION: to remove a specific kernel package, in this case 3.13.0-49 sudo apt-get remove --purge $(dpkg -l 'linux-*' | sed '/^ii/!d;/'"$(uname -r | sed "s/\(.*\)-\([^0-9]\+\)/\1/")"'/d;s/^[^ ]* [^ ]* \([^ ]*\).*/\1/;/[0-9]/!d' | grep 3.13.0-49)
And if you happen to blow away all the kernel images (as I have done more than once),
get your current kernel version back by executing uname -r
and then reinstall it with:
sudo apt-get install linux-image-x.x.x-xx
where x.x.x-xx
is the kernel version number give by the uname -r
command.
SSD TRIM
Solid-state drives (SSD) have brought about a new way of managing storage. SSDs have benefits like silent and cooler operation and a faster interface spec, compared to hard drives but brings with it new methods of maintenance and management. SSDs have a feature called TRIM. This is essentially a method for reclaiming unused blocks on the device, which may have been previously written, but no longer contain valid data and therefore, can be returned to the general storage pool for reuse.
Ubuntu Linux executes TRIM services via systemd timer service.
To check the existence and current status,
run sudo systemctl list-timers --all
:
# verify TRIM services are enabled $ sudo systemctl list-timers --all NEXT LEFT LAST PASSED UNIT Sat 2020-03-21 22:33:01 EDT 3min 10s left Sat 2020-03-21 21:33:27 EDT 56min ago anacron.timer Sun 2020-03-22 00:00:00 EDT 1h 30min left Sat 2020-03-21 00:00:01 EDT 22h ago logrotate.timer Sun 2020-03-22 00:00:00 EDT 1h 30min left Sat 2020-03-21 00:00:01 EDT 22h ago man-db.timer Sun 2020-03-22 00:10:05 EDT 1h 40min left Sat 2020-03-21 12:40:56 EDT 9h ago apt-daily.timer Sun 2020-03-22 03:10:22 EDT 4h 40min left Sun 2020-03-15 03:10:28 EDT 6 days ago e2scrub_all.timer Sun 2020-03-22 06:35:01 EDT 8h left Sat 2020-03-21 06:19:49 EDT 16h ago apt-daily-upgrade.timer Sun 2020-03-22 08:23:14 EDT 9h left Sat 2020-03-21 14:38:27 EDT 7h ago motd-news.timer Sun 2020-03-22 08:35:00 EDT 10h left Sat 2020-03-21 20:09:27 EDT 2h 20min ago mdmonitor-oneshot.timer Sun 2020-03-22 18:07:19 EDT 19h left Sat 2020-03-21 18:07:19 EDT 4h 22min ago systemd-tmpfiles-clean. Mon 2020-03-23 00:00:00 EDT 1 day 1h left Mon 2020-03-16 00:00:01 EDT 5 days ago fstrim.timer Sun 2020-04-05 11:39:18 EDT 2 weeks 0 days left Sun 2020-03-01 11:52:26 EST 2 weeks 6 days ago mdcheck_start.timer n/a n/a n/a n/a mdcheck_continue.timer n/a n/a n/a n/a snap-repair.timer n/a n/a n/a n/a snapd.refresh.timer n/a n/a n/a n/a snapd.snap-repair.timer 15 timers listed.
For additional information, check out:
When the Hard Disk Goes South
For basic disk errors, you could try letting Linux heal itself with fsck
at boot up.
To do this, shut down the system with the -F option like this:
sudo shutdown -r -F now
Linux reboots immediately and looks for disk errors with the fsck
command.
Confirm fixing disk errors by pressing "y" and "Enter" if prompted.
If this fails, read these articles "Disk Maintenance under Linux (Disk Recovery", How to recover partitions and data using Linux - Tutorial, and then follow the instructions carefully!
Cleaning Up After a System Crash
At some point your system will crash and you'll need to perform a manual repair of your filesystem. When this happens, you'll reboot, the system stops, and then indicates you must perform a manual repair of the filesystem.
Check & Repair Filesystem
fsck
(filesystem consistency check) is a command used to check filesystem
for consistency errors and repair them on Linux filesystems.
This tool is important for maintaining data integrity.
It should be run regularly,
especially after an unforeseen reboot (e.g. crash, power-outage).
NOTE: You need to be "root" to use
fsck
and it is very important to unmount the filesystem before running it.
First take the system to runlevel one (single user mode).
Unmount the filesystem, and then run fsck
.
For example, if the filesystem in question is /home
(or its device named /dev/sdh
) then type command:
umount /home
# or
umount /dev/sdh
Once fsck
finished, remount the filesystem:
mount /home
Now start the check/repair via the command
fsck -y /dev/sdh
fsck
will check the filesystem and ask which problems should be
fixed or corrected you don not use the -y
option.
Any files that are recovered are placed in
the /home/lost+found
directory by fsck
command.
Once fsck finished, remount the filesystem:
mount /home
Superblock Corruption
It is possible that fsck
will fail with a message telling you that
your filesystem has a bad superblock.
The filesystem's superblock contains information about the filesystems type, size, structure, etc.
This very important information, and if totally lost, its catastrophic.
Luckily, redundant superblock information is maintained,
and therefore, this too can be repaired.
You can find the location of the superblocks via
# find the location of superblocks
sudo umount /dev/sdh
sudo dumpe2fs /dev/sdh | grep -i superblock
The superblock labeled as "Primary" is the culprit, so you choose another to take its place.
Pick another superblock number and use it as the -b
options parameter.
sudo e2fsck -y -b 32768 /dev/sdh
Effectively, your running a filesystem check but using an alternate superblock. This could run for a very long time (like hours). After it completes, corruption should be removed and the primary superblock restored.
When the Filesystem is Full
Sometimes after a system crash,
you'll get a message like "The volume filesystem root has only 0 bytes disk space remaining".
If you run the df -h
command, you will in fact see it.
This is almost certainly due to being actually out of space on the root filesystem,
and not some erroneous message by Linux, so you must dig to find the source of the problem.
NOTE: A deceptive source of filling your root filesystem is when there is a failure/corruption of a disk drive mount on
/mnt
. The disk maybe unmounted, but never the less, programs can still successfully write to/mnt
directory structure. In my case, I had my backup system using/mnt/backup
and it appear everything was find, but in reality, data wasn't going to the external drive but instead to the root filesystem under/mnt
, filling up the root filesystem.To check for this,
umount
any hard drives mounted to/mnt
. This should remove the filesystem. Now look for the filesystem, and if parts of it are still there, this could very well be the source of your file space problem.
A useful command for finding what's eating up all the space is the "disk usage" command, du
.
Running the following command:
# 20 largest contributors to storage consumption
$ sudo du -ahx / | sort -rh | head -n 20
103G /
73G /var
48G /var/log
46G /var/log/uvcdynctrl-udev.log
21G /usr
13G /var/lib
11G /var/var
11G /var/lib/mlocate
9.2G /var/var/lib
9.1G /usr/usr
8.7G /var/var/lib/mlocate
5.1G /usr/lib
4.0G /Dropbox
3.9G /Dropbox/Dropbox
3.7G /usr/usr/lib
3.6G /opt
3.0G /var/var/lib/mlocate/mlocate.db
2.9G /usr/usr/local
2.9G /usr/local
2.8G /usr/usr/local/lib
This will give you the total amount of space used (-a
)
for all files not just directories,
without looking at other filesystems (-x
),
in human-readable numbers like "124M" (-h
),
and sort with the largest contributors.
Don't worry if it takes a while to complete, it could take on the order of minutes.
Don't delete files without first knowing what they are, of course. But, in general, you won't break your system if you delete files in the following directories:
/tmp
user temp data -- these are commonly all deleted every reboot anyway/var/tmp
print spools, and other system temporary data/var/cache/*
this one can be dangerous, research first!/root
the root user's home directory
In addition to the locations above, the following locations are common culprits:
/opt
many third-party apps install here, and don't clean up after themselves/var/log
log files can eat up a lot of space if there are repetitive errors
Very Large Log Files
Linux log files found in /var/log
can be a source of your filesystem full.
These log files will quickly fill if there are problems within the system.
To investigate further what may be the source your filesystem full,
find the top ten largest files and directories in /var/log
:
# print number of bytes in the 10 largest files and directories in /var/log
$ sudo du -ahx /var/log/ | sort -rh | head -n 10
48G /var/log/
46G /var/log/uvcdynctrl-udev.log
1.9G /var/log/journal/00f23270d58ed942283218b055d9d601
1.9G /var/log/journal
121M /var/log/journal/00f23270d58ed942283218b055d9d601/system@f8acccb9260a4855b984823647bc1539-000000000003a46a-00059d7019fadab5.journal
105M /var/log/journal/00f23270d58ed942283218b055d9d601/system@f8acccb9260a4855b984823647bc1539-0000000000091ce3-00059d74a2fe9ea9.journal
105M /var/log/journal/00f23270d58ed942283218b055d9d601/system@f8acccb9260a4855b984823647bc1539-0000000000075a69-00059d73c160def7.journal
89M /var/log/journal/00f23270d58ed942283218b055d9d601/system@f8acccb9260a4855b984823647bc1539-0000000000060b47-00059d72b9ef1167.journal
81M /var/log/journal/00f23270d58ed942283218b055d9d601/system@bf6951c9e0384a0b8c9b5aa044c06ddd-000000000004987b-0005a1c99c09dd09.journal
81M /var/log/journal/00f23270d58ed942283218b055d9d601/system@bf6951c9e0384a0b8c9b5aa044c06ddd-0000000000032c53-0005a03b276009d1.journal
Its very clear that we have a very bad actor in the file
/var/log/uvcdynctrl-udev.log
which is 46G in size!
In fact, this is a know problem, and judging from the files contents,
the problem was brought about due to my webcam (an old Logitech QuickCam Orbit/Sphere AF).
NOTE: We site referance above states "This package not only creates this HUGE log files, but it also causes Cheese and other web-cam apps to crash or work very badly (Can't capture video at full resolution with Cheese or Guvcview? REMOVE THIS PACKAGE AND IT WORKS AS IT SHOULD!)" Recommended package removal is
sudo apt-get remove uvcdynctrl-udev
.
To clean up this offending log file,
we can use the : >
operation to truncate the file to zero bytes.
For example, this will reduce the syslog
and kern.log
files to zero bytes:
# check the size of uvcdynctrl-udev.log $ ls -lh /var/log/uvcdynctrl-udev.log -rw-r--r-- 1 root root 46G May 22 10:39 /var/log/uvcdynctrl-udev.log # reduce the file uvcdynctrl-udev.log to zero bites sudo su cd /var/log : > uvcdynctrl-udev.log exit # recheck the size of uvcdynctrl-udev.log $ ls -lh /var/log/uvcdynctrl-udev.log -rw-r--r-- 1 root root 0 May 22 21:29 /var/log/uvcdynctrl-udev.log
Deleted But Open Files
Another potential source of a full filesystem are large
files left open but have been deleted.
On Linux, a file may be deleted (removed/unlinked) while a process has it open.
When this happens, the file is essentially invisible to other processes,
but it still takes on physical space on the drive.
Tools like du
will not see it.
To find these deleted but open files, run the utility lsof
# list open but deleted files sudo lsof -nP | grep '(deleted)'
To find out how much space is taken up by these deleted but open files, run:
# total space used by open but deleted files sudo lsof -nP | awk '/deleted/ { sum+=$8 } END { print sum }'
lsof
will list all open files,
awk
then searches for the deleted files, and sums up the file sizes (in bytes).
To get a list of process ID (PID) that own these file that are not redundant, use
# process IDs of open but deleted files sudo lsof -nP | grep '(deleted)' | awk '{ print $2 }' | sort | uniq
It's up to the filesystem driver to actually free the allocated space, and that will usually happen only once all file descriptors referring to that file are released. So you can't really reclaim the space, unless you make the application close the file. The challenge is that the responsible applications are long gone due to the system crash. How does one "delete" a file that the operating system believes has already been deleted?
Actually we cannot remove the file as long as the file is still in use by a process.
But what we can do is: Getting the size down to 0 via Linux' /proc
filesystem.
/proc
is very special in that it is also a virtual filesystem.
It's sometimes referred to as a process information pseudo-file system.
It doesn't contain 'real' files but runtime system information
(e.g. system memory, devices mounted, hardware configuration, etc).
Within /proc
, each of the numbered directories corresponds to an actual process ID.
Looking at the process table, you can match running processes with the associated process ID.
Or at least that what should be happening absent a system crash.
When the system crashed, the process is gone but the deleted but open file remains.
To fix this problem, our strategy is to truncate these deleted but open files.
Your can truncate text file and make the size to zero using redirection:
For example, if 2746
is the process ID with in /proc
with a deleted but open file,
the coresponding file must be truncate zero bytes.
This can be done by switching to full root mode, find the effected files, and truncate them.
For an example with a PID of 2746
sudo -s cd /proc/2746/fd ls -l | grep '(deleted)' | awk '{ print $9 }'
The numbers produced by the last command is the file to be truncate.
For example, if one of the numbers produce is 12
, then use this operation
to truncate the file to zero bytes
:> /proc/2746/fd/12
You could automate all of the above with this command line
# for all open but deleted files associated with process 2746, trunctate the file to 0 bytes PID=2746 ; cd /proc/$PID/fd ; ls -l | grep '(deleted)' | awk '{ print $9 }' | while read FILE; do :> /proc/$PID/fd/$FILE; done
Finally, it's a good idea to do a forced full file system check on the next reboot.
# file system check on next reboot sudo touch /forcefsck # or # shutdown and do a file system check now shutdown -rF now
Sources
I primarily consulted the following sources to create this posting:
- Why use apt-get upgrade instead of apt-get dist-upgrade?
- What Kind of Maintenance Do I Need to Do On My Linux PC?
- Is system cleanup/optimization needed
- Disk Maintenance under Linux (Disk Recovery)
- Smartmontools
- S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology)
- Monitoring Hard Drive Health on Linux with smartmontools
- How do I remove or hide old kernel versions to clean up the boot menu or free space?
- How to Get Rid of Deleted Open Files