Commit 6b285a6

health (docs): add how to reduce disk usage
1 parent 9ca1380 commit 6b285a6

1 file changed

Lines changed: 305 additions & 0 deletions

health/README.md

@@ -0,0 +1,305 @@
1+
# Health
2+
3+
## Disk usage
4+
5+
Servers can run out of disk space in a number of ways, including image dumps from Amazon, logs, and so forth.
6+
7+
One symptom is an error message when attempting to run a command, such as updating an SSL certificate:
8+
9+
```
10+
sudo certbot certonly --manual -d 'practable.io'
11+
Traceback (most recent call last):
12+
File "/snap/certbot/2618/bin/certbot", line 8, in <module>
13+
sys.exit(main())
14+
File "/snap/certbot/2618/lib/python3.8/site-packages/certbot/main.py", line 19, in main
15+
return internal_main.main(cli_args)
16+
File "/snap/certbot/2618/lib/python3.8/site-packages/certbot/_internal/main.py", line 1700, in main
17+
log.pre_arg_parse_setup()
18+
File "/snap/certbot/2618/lib/python3.8/site-packages/certbot/_internal/log.py", line 71, in pre_arg_parse_setup
19+
temp_handler = TempHandler()
20+
File "/snap/certbot/2618/lib/python3.8/site-packages/certbot/_internal/log.py", line 267, in __init__
21+
self._workdir = tempfile.mkdtemp(prefix="certbot-log-")
22+
File "/snap/certbot/2618/usr/lib/python3.8/tempfile.py", line 486, in mkdtemp
23+
prefix, suffix, dir, output_type = _sanitize_params(prefix, suffix, dir)
24+
File "/snap/certbot/2618/usr/lib/python3.8/tempfile.py", line 256, in _sanitize_params
25+
dir = gettempdir()
26+
File "/snap/certbot/2618/usr/lib/python3.8/tempfile.py", line 425, in gettempdir
27+
tempdir = _get_default_tempdir()
28+
File "/snap/certbot/2618/usr/lib/python3.8/tempfile.py", line 357, in _get_default_tempdir
29+
raise FileNotFoundError(_errno.ENOENT,
30+
FileNotFoundError: [Errno 2] No usable temporary directory found in ['/tmp', '/var/tmp', '/usr/tmp', '/home/ubuntu']
31+
```
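The traceback ends in Python's `tempfile` module, which gives up when it cannot create and write a small file in any of the candidate directories; a full disk is one common cause. As a rough sketch, a similar probe can be run from the shell. The function name `check_tmp_dirs` and the probe file name are illustrative assumptions, not part of certbot.

```shell
# Sketch: try to create and write a small file in each directory given.
# check_tmp_dirs is a hypothetical helper name.
check_tmp_dirs() {
    for d in "$@"; do
        f=""
        if f=$(mktemp "$d/probe.XXXXXX" 2>/dev/null) && echo test > "$f" 2>/dev/null; then
            echo "usable: $d"
        else
            echo "unusable: $d"
        fi
        # Clean up the probe file if it was created.
        if [ -n "$f" ]; then
            rm -f "$f"
        fi
    done
}

# Example: the directories named in the error message above.
# check_tmp_dirs /tmp /var/tmp /usr/tmp /home/ubuntu
```

If every directory is reported as unusable, checking free space with `df -h` is a reasonable next step.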
32+
33+
The first step is to [identify which locations are using the most disk](https://askubuntu.com/questions/266825/what-do-i-do-when-my-root-filesystem-is-full) using this command:
34+
35+
```
36+
sudo du -hsx /* | sort -rh | head -n 40
37+
```
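In that command, `-h` prints human-readable sizes, `-s` summarises each argument, and `-x` keeps `du` on one filesystem; `sort -rh` then orders the human-readable sizes largest first. A minimal sketch of the same pipeline as a reusable function follows. The name `top_usage` is an assumption, and errors are discarded so that unreadable or vanished entries do not clutter the output.

```shell
# Sketch: list the largest entries directly under a directory.
# top_usage is a hypothetical helper name.
#   $1 = directory to inspect (default /)
#   $2 = number of entries to show (default 40)
top_usage() {
    dir="${1:-/}"
    count="${2:-40}"
    # ${dir%/} drops a trailing slash, so "/" expands to /* as in the original command.
    du -hsx "${dir%/}"/* 2>/dev/null | sort -rh | head -n "$count"
}

# Example (run from a root shell to include directories only root can read):
# top_usage / 40
```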
38+
39+
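Example output for a full disk (the `cannot access '/proc/...'` errors come from short-lived process entries that vanish while `du` is scanning, and can be ignored):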
40+
41+
```
42+
du: cannot access '/proc/2535453/task/2535453/fd/4': No such file or directory
43+
du: cannot access '/proc/2535453/task/2535453/fdinfo/4': No such file or directory
44+
du: cannot access '/proc/2535453/fd/3': No such file or directory
45+
du: cannot access '/proc/2535453/fdinfo/3': No such file or directory
46+
4.4G /usr
47+
2.8G /var
48+
229M /home
49+
228M /boot
50+
21M /run
51+
6.5M /etc
52+
92K /root
53+
72K /tmp
54+
44K /snap
55+
16K /lost+found
56+
4.0K /srv
57+
4.0K /opt
58+
4.0K /mnt
59+
4.0K /media
60+
0 /sys
61+
0 /sbin
62+
0 /proc
63+
0 /libx32
64+
0 /lib64
65+
0 /lib32
66+
0 /lib
67+
0 /dev
68+
0 /bin
69+
```
70+
71+
Then run it again on the directories with the largest usage, e.g.:
72+
```
73+
$ sudo du -hsx /usr/* | sort -rh | head -n 35
74+
2.6G /usr/share
75+
924M /usr/lib
76+
583M /usr/src
77+
237M /usr/bin
78+
55M /usr/sbin
79+
17M /usr/local
80+
648K /usr/libexec
81+
112K /usr/include
82+
4.0K /usr/libx32
83+
4.0K /usr/lib64
84+
4.0K /usr/lib32
85+
4.0K /usr/games
86+
```
87+
88+
```
89+
$ sudo du -hsx /usr/share/* | sort -rh | head -n 35
90+
2.4G /usr/share/nginx
91+
34M /usr/share/vim
92+
28M /usr/share/locale
93+
28M /usr/share/doc
94+
20M /usr/share/perl
95+
18M /usr/share/man
96+
17M /usr/share/i18n
97+
7.0M /usr/share/terminfo
98+
5.9M /usr/share/X11
99+
5.7M /usr/share/mime
100+
4.9M /usr/share/mysql
101+
4.7M /usr/share/zoneinfo
102+
2.9M /usr/share/bash-completion
103+
2.8M /usr/share/fonts
104+
2.5M /usr/share/grub
105+
2.2M /usr/share/consolefonts
106+
2.1M /usr/share/perl5
107+
1.8M /usr/share/groff
108+
1.5M /usr/share/xml
109+
1.5M /usr/share/iso-codes
110+
1.5M /usr/share/alsa
111+
1.4M /usr/share/fwupd
112+
1.2M /usr/share/misc
113+
1.2M /usr/share/info
114+
668K /usr/share/polkit-1
115+
636K /usr/share/lintian
116+
584K /usr/share/ufw
117+
536K /usr/share/sounds
118+
528K /usr/share/ca-certificates
119+
500K /usr/share/initramfs-tools
120+
488K /usr/share/zoneinfo-icu
121+
460K /usr/share/calendar
122+
428K /usr/share/dbus-1
123+
384K /usr/share/bug
124+
372K /usr/share/apport
125+
```
126+
127+
```
128+
$ sudo du -hsx /usr/share/nginx/* | sort -rh | head -n 35
129+
2.2G /usr/share/nginx/pdf.gradex.io
130+
140M /usr/share/nginx/practable.io
131+
58M /usr/share/nginx/wordpress.generic
132+
20K /usr/share/nginx/modules-available
133+
8.0K /usr/share/nginx/html
134+
0 /usr/share/nginx/modules
135+
```
136+
137+
```
138+
$ sudo du -hsx /usr/share/nginx/pdf.gradex.io/* | sort -rh | head -n 35
139+
2.2G /usr/share/nginx/pdf.gradex.io/wp-content
140+
40M /usr/share/nginx/pdf.gradex.io/wp-includes
141+
9.6M /usr/share/nginx/pdf.gradex.io/wp-admin
142+
48K /usr/share/nginx/pdf.gradex.io/wp-login.php
143+
32K /usr/share/nginx/pdf.gradex.io/wp-signup.php
144+
24K /usr/share/nginx/pdf.gradex.io/wp-settings.php
145+
20K /usr/share/nginx/pdf.gradex.io/license.txt
146+
12K /usr/share/nginx/pdf.gradex.io/wp-mail.php
147+
8.0K /usr/share/nginx/pdf.gradex.io/wp-trackback.php
148+
8.0K /usr/share/nginx/pdf.gradex.io/wp-activate.php
149+
8.0K /usr/share/nginx/pdf.gradex.io/readme.html
150+
4.0K /usr/share/nginx/pdf.gradex.io/xmlrpc.php
151+
4.0K /usr/share/nginx/pdf.gradex.io/wp-load.php
152+
4.0K /usr/share/nginx/pdf.gradex.io/wp-links-opml.php
153+
4.0K /usr/share/nginx/pdf.gradex.io/wp-cron.php
154+
4.0K /usr/share/nginx/pdf.gradex.io/wp-config.php
155+
4.0K /usr/share/nginx/pdf.gradex.io/wp-config-sample.php
156+
4.0K /usr/share/nginx/pdf.gradex.io/wp-comments-post.php
157+
4.0K /usr/share/nginx/pdf.gradex.io/wp-blog-header.php
158+
4.0K /usr/share/nginx/pdf.gradex.io/index.php
159+
```
160+
```
161+
$ sudo du -hsx /usr/share/nginx/pdf.gradex.io/wp-content/* | sort -rh | head -n 35
162+
2.1G /usr/share/nginx/pdf.gradex.io/wp-content/updraft
163+
73M /usr/share/nginx/pdf.gradex.io/wp-content/uploads
164+
30M /usr/share/nginx/pdf.gradex.io/wp-content/plugins
165+
4.5M /usr/share/nginx/pdf.gradex.io/wp-content/themes
166+
3.1M /usr/share/nginx/pdf.gradex.io/wp-content/languages
167+
4.0K /usr/share/nginx/pdf.gradex.io/wp-content/upgrade
168+
4.0K /usr/share/nginx/pdf.gradex.io/wp-content/index.php
169+
```
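The repeated descent above can be sketched as a loop that always follows the largest entry. This is only an illustration: the name `drill_down` is an assumption, sizes are printed in KiB (from `du -k`) so they can be sorted numerically, and the loop stops at a file, an empty directory, or the given depth.

```shell
# Sketch: repeatedly descend into the largest entry of a directory.
# drill_down is a hypothetical helper name.
#   $1 = starting directory
#   $2 = maximum number of levels to descend (default 5)
drill_down() {
    path="${1%/}"
    depth="${2:-5}"
    i=0
    while [ "$i" -lt "$depth" ] && [ -d "${path:-/}" ]; do
        # Largest entry at this level, as "<size in KiB><TAB><path>".
        largest=$(du -skx "$path"/* 2>/dev/null | sort -rn | head -n 1)
        [ -n "$largest" ] || break
        printf '%s\n' "$largest"
        # Continue from the path of the largest entry.
        path=$(printf '%s\n' "$largest" | cut -f2-)
        i=$((i + 1))
    done
}

# Example:
# drill_down /usr 5
```

Going by the listings above, starting from `/usr` this would be expected to pass through `/usr/share`, `/usr/share/nginx`, `pdf.gradex.io`, `wp-content` and `updraft` in turn.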
170+
171+
Our backups have accumulated, so let's download them and then delete them from the server:
172+
173+
```
174+
# on admin machine, in some suitable dir
175+
scp -i ~/practable-realm.pem 'ubuntu@practable.io:/usr/share/nginx/pdf.gradex.io/wp-content/updraft/backup_*' ./
176+
```
177+
178+
```
# on remote machine, once the downloaded copies have been checked
cd /usr/share/nginx/pdf.gradex.io/wp-content/updraft
rm backup_*
```
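Since the remote copies are deleted for good, it may be worth confirming that the downloads are intact first. One possible approach, sketched below, is to compare SHA-256 checksums with `sha256sum` from GNU coreutils. The helper names `make_manifest` and `check_manifest` and the manifest file name `SHA256SUMS` are assumptions for illustration.

```shell
# Sketch: verify copied backups by checksum before deleting the originals.
# make_manifest and check_manifest are hypothetical helper names.

# Print checksums for all backup_* files in directory $1.
make_manifest() {
    (cd "$1" && sha256sum backup_*)
}

# Verify the files in directory $1 against manifest $2 (use an absolute path).
# Returns non-zero if any file is missing or differs.
check_manifest() {
    (cd "$1" && sha256sum --quiet -c "$2")
}

# Example:
#   on the remote machine:
#     make_manifest /usr/share/nginx/pdf.gradex.io/wp-content/updraft > ~/SHA256SUMS
#   on the admin machine, after copying SHA256SUMS next to the downloads:
#     check_manifest "$PWD" "$PWD/SHA256SUMS" && echo "downloads match"
```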
183+
184+
Next we look in `/var`, where `lib` holds the most data, so look there:
185+
186+
```
187+
sudo du -hsx /var/lib/* | sort -rh | head -n 35
188+
1.7G /var/lib/snapd
189+
171M /var/lib/mysql
190+
162M /var/lib/apt
191+
39M /var/lib/dpkg
192+
<snip>
193+
```
194+
195+
Drilling into `/var/lib/snapd` in the same way gives:
```
196+
1.1G /var/lib/snapd/snaps
197+
353M /var/lib/snapd/cache
198+
252M /var/lib/snapd/seed
199+
896K /var/lib/snapd/assertions
200+
536K /var/lib/snapd/apparmor
201+
356K /var/lib/snapd/seccomp
202+
88K /var/lib/snapd/state.json
203+
40K /var/lib/snapd/sequence
204+
36K /var/lib/snapd/cookie
205+
20K /var/lib/snapd/lib
206+
20K /var/lib/snapd/desktop
207+
12K /var/lib/snapd/device
208+
12K /var/lib/snapd/dbus-1
209+
8.0K /var/lib/snapd/ssl
210+
8.0K /var/lib/snapd/mount
211+
4.0K /var/lib/snapd/void
212+
4.0K /var/lib/snapd/system-params
213+
4.0K /var/lib/snapd/system-key
214+
4.0K /var/lib/snapd/inhibit
215+
4.0K /var/lib/snapd/hostfs
216+
4.0K /var/lib/snapd/firstboot
217+
4.0K /var/lib/snapd/features
218+
4.0K /var/lib/snapd/environment
219+
4.0K /var/lib/snapd/auto-import
220+
0 /var/lib/snapd/state.lock
221+
```
222+
223+
We could [clean](https://www.debugpoint.com/clean-up-snap/) this, but we'd have to stop some snaps, so we need to know which snaps are in use. The output of `snap list --all` is:
224+
225+
```
226+
Name Version Rev Tracking Publisher Notes
227+
amazon-ssm-agent 3.1.1188.0 5656 latest/stable/… aws✓ disabled,classic
228+
amazon-ssm-agent 3.1.1732.0 6312 latest/stable/… aws✓ classic
229+
certbot 1.32.2 2618 latest/stable certbot-eff✓ classic
230+
certbot 1.32.1 2582 latest/stable certbot-eff✓ disabled,classic
231+
core 16-2.57.6 14399 latest/stable canonical✓ core,disabled
232+
core 16-2.58 14447 latest/stable canonical✓ core
233+
core18 20221212 2667 latest/stable canonical✓ base
234+
core18 20221205 2654 latest/stable canonical✓ base,disabled
235+
core20 20221123 1738 latest/stable canonical✓ base,disabled
236+
core20 20221212 1778 latest/stable canonical✓ base
237+
go 1.19.5 10030 latest/stable mwhudson classic
238+
go 1.19.4 10008 latest/stable mwhudson disabled,classic
239+
lxd 4.0.9-eb5e237 23991 4.0/stable/… canonical✓ disabled
240+
lxd 4.0.9-a29c6f1 24061 4.0/stable/… canonical✓ -
241+
snapd 2.58 17950 latest/stable canonical✓ snapd
242+
snapd 2.57.6 17883 latest/stable canonical✓ snapd,disabled
243+
```
244+
245+
Since the default is to keep three revisions but only two of each snap are installed, it looks like we already issued this command in the past:
246+
```
247+
sudo snap set system refresh.retain=2
248+
```
249+
250+
`clean_snap.sh`:
251+
```
#!/bin/bash
# Removes old (disabled) revisions of snaps.
# CLOSE ALL SNAPS BEFORE RUNNING THIS
set -eu
# List every installed revision, keep only those marked "disabled",
# and print the snap name ($1) and revision number ($3).
LANG=en_US.UTF-8 snap list --all | awk '/disabled/{print $1, $3}' |
    while read -r snapname revision; do
        snap remove "$snapname" --revision="$revision"
    done
```
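Before removing anything, a dry run can show which revisions a script like this would target. The sketch below only parses text in the `snap list --all` format shown earlier and removes nothing. The name `list_disabled` is an assumption, and unlike the script above it matches `disabled` only in the last (Notes) column.

```shell
# Sketch: read `snap list --all` output on stdin and print "name revision"
# for each disabled revision. list_disabled is a hypothetical helper name.
list_disabled() {
    # Skip the header line; match "disabled" as a whole comma-separated note.
    awk 'NR > 1 && $NF ~ /(^|,)disabled(,|$)/ { print $1, $3 }'
}

# Example dry run:
# snap list --all | list_disabled |
#     while read -r name rev; do
#         echo "would remove $name (revision $rev)"
#     done
```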
261+
262+
263+
It's not obvious how to stop `core`, so let's bash on:
264+
```
265+
$ sudo ./clean_snap.sh
266+
amazon-ssm-agent (revision 5656) removed
267+
certbot (revision 2582) removed
268+
core (revision 14399) removed
269+
core18 (revision 2654) removed
270+
core20 (revision 1738) removed
271+
go (revision 10008) removed
272+
lxd (revision 23991) removed
273+
snapd (revision 17883) removed
274+
```
275+
```
276+
$ sudo du -hsx /var/lib/* | sort -rh | head -n 35
277+
1.2G /var/lib/snapd
278+
171M /var/lib/mysql
279+
162M /var/lib/apt
280+
39M /var/lib/dpkg
281+
```
282+
283+
That's saved another 0.5GB. We're back to 70% usage.
284+
285+
```
286+
$ df
287+
Filesystem 1K-blocks Used Available Use% Mounted on
288+
/dev/root 8065444 5605224 2443836 70% /
289+
devtmpfs 989876 0 989876 0% /dev
290+
tmpfs 996480 0 996480 0% /dev/shm
291+
tmpfs 199296 3420 195876 2% /run
292+
tmpfs 5120 0 5120 0% /run/lock
293+
tmpfs 996480 0 996480 0% /sys/fs/cgroup
294+
tmpfs 199296 0 199296 0% /run/user/1000
295+
/dev/loop10 25088 25088 0 100% /snap/amazon-ssm-agent/6312
296+
/dev/loop0 94080 94080 0 100% /snap/lxd/24061
297+
/dev/loop2 45824 45824 0 100% /snap/certbot/2618
298+
/dev/loop8 56960 56960 0 100% /snap/core18/2667
299+
/dev/loop11 64896 64896 0 100% /snap/core20/1778
300+
/dev/loop13 51072 51072 0 100% /snap/snapd/17950
301+
/dev/loop15 119552 119552 0 100% /snap/core/14447
302+
/dev/loop17 107776 107776 0 100% /snap/go/10030
303+
```
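To catch the problem before commands start failing, the `Use%` column can be checked against a limit, for example from a scheduled job. This is a minimal sketch: the helper names `usage_percent` and `check_usage` and the default limit of 80% are assumptions, not settings used on this server.

```shell
# Sketch: warn when a filesystem is fuller than a chosen limit.
# usage_percent and check_usage are hypothetical helper names.

# Print the Use% value (without the % sign) for the filesystem holding $1.
usage_percent() {
    # -P selects the POSIX output format, one line per filesystem.
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report on mount point $1 (default /) against limit $2 (default 80, assumed).
# Returns non-zero when usage is at or above the limit.
check_usage() {
    mount_point="${1:-/}"
    limit="${2:-80}"
    used=$(usage_percent "$mount_point")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount_point is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $mount_point is ${used}% full"
}

# Example:
# check_usage / 80
```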
304+
305+
We could consider increasing the size of this disk.
