<?xml version="1.0"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>CRIN Trac: Ticket #81: crin3 backup issue</title>
    <link>https://trac.crin.org/trac/ticket/81</link>
    <description>&lt;p&gt;
I'm getting this Munin email alert every 5 minutes:
&lt;/p&gt;
&lt;pre class="wiki"&gt;Date: Fri, 19 Aug 2016 09:30:16 +0000
From: munin application user &amp;lt;munin@crin1.crin.org&amp;gt;
Subject: crin3.crin.org Munin Alert
crin.org :: crin3.crin.org :: Disk usage in percent
        WARNINGs: / is 97.35 (outside range [:92]).
        OKs: /run is 10.40, /dev/shm is 0.00, /boot is 87.07, /run/lock is 0.00, /sys/fs/cgroup is 0.00, /run/user/1000 is 0.00.
&lt;/pre&gt;</description>
    <language>en-us</language>
    <image>
      <title>CRIN Trac</title>
      <url>https://trac.crin.org/trac/chrome/site/logo.gif</url>
      <link>https://trac.crin.org/trac/ticket/81</link>
    </image>
    <generator>Trac 1.0.2</generator>
    <item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Fri, 19 Aug 2016 09:56:03 GMT</pubDate>
      <title>hours changed; totalhours set</title>
      <link>https://trac.crin.org/trac/ticket/81#comment:1</link>
      <guid isPermaLink="false">https://trac.crin.org/trac/ticket/81#comment:1</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0&lt;/em&gt; to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                set to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
This appears to be caused by a backup job that has gone wrong. Two jobs are running, but only one &lt;tt&gt;s3ql&lt;/tt&gt; filesystem is connected:
&lt;/p&gt;
&lt;pre class="wiki"&gt;df -h
Filesystem                             Size  Used Avail Use% Mounted on
udev                                   489M     0  489M   0% /dev
tmpfs                                  100M   11M   90M  11% /run
/dev/mapper/CRIN3--vg-root              15G   14G  378M  98% /
tmpfs                                  499M     0  499M   0% /dev/shm
tmpfs                                  5.0M     0  5.0M   0% /run/lock
tmpfs                                  499M     0  499M   0% /sys/fs/cgroup
/dev/sda1                              236M  195M   29M  88% /boot
crin1:/                                121G   27G   88G  24% /media/sshfs/crin1
s3c://s.qstack.advania.com:443/crin1/  1.0T  252G  773G  25% /media/s3ql/crin1
crin2:/                                121G   24G   91G  21% /media/sshfs/crin2
tmpfs                                  100M     0  100M   0% /run/user/1000
&lt;/pre&gt;&lt;pre class="wiki"&gt;ps -lA | grep rsync
0 S     0  2290  1743  0  80   0 -  5960 -      ?        00:00:23 rsync
1 S     0  2291  2290  0  80   0 -  5869 -      ?        00:00:00 rsync
1 S     0  2292  2291  0  80   0 -  5917 -      ?        00:00:24 rsync
&lt;/pre&gt;&lt;p&gt;
So:
&lt;/p&gt;
&lt;pre class="wiki"&gt;killall rsync
ps -lA | grep rsync
  1 S     0  2292     1  0  80   0 -  5917 -      ?        00:00:24 rsync
kill 2292
ps -lA | grep rsync
  1 D     0  2292     1  0  80   0 -  5917 -      ?        00:00:24 rsync
&lt;/pre&gt;&lt;p&gt;
This seems to be an unkillable process, so I'm rebooting the server and will check how it looks later.
&lt;/p&gt;
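&lt;p&gt;
A &lt;tt&gt;D&lt;/tt&gt; in the state column means uninterruptible sleep: the process is blocked inside the kernel, typically on I/O, and signals are not acted on until that call returns. A minimal sketch for spotting such processes, assuming the procps version of &lt;tt&gt;ps&lt;/tt&gt; (the function name &lt;tt&gt;filter_blocked&lt;/tt&gt; is made up for illustration):
&lt;/p&gt;
&lt;pre class="wiki"&gt;# Keep the header line plus every process whose state starts with D.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# wchan shows the kernel function the process is sleeping in.
ps -eo pid,stat,wchan:32,comm | filter_blocked
&lt;/pre&gt;&lt;p&gt;
If the wait channel points at a network or FUSE filesystem, a stuck mount is a likely suspect; that is a guess to verify, not something confirmed in this ticket.
&lt;/p&gt;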
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Fri, 19 Aug 2016 21:04:48 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>https://trac.crin.org/trac/ticket/81#comment:2</link>
      <guid isPermaLink="false">https://trac.crin.org/trac/ticket/81#comment:2</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0&lt;/em&gt; to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;0.25&lt;/em&gt; to &lt;em&gt;0.5&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
Checking where the space is being used:
&lt;/p&gt;
&lt;pre class="wiki"&gt;cd /
du -h --max-depth=1
8.1M    ./bin
4.0K    ./opt
1.4G    ./lib
7.0M    ./sbin
193M    ./boot
24K     ./tmp
1.5G    ./usr
5.4M    ./run
0       ./sys
5.2M    ./etc
40K     ./media
9.3G    ./root
0       ./dev
746M    ./var
4.0K    ./mnt
4.0K    ./srv
4.0K    ./lib64
104K    ./home
16K     ./lost+found
du: cannot access './proc/24896/task/24896/fd/4': No such file or directory
du: cannot access './proc/24896/task/24896/fdinfo/4': No such file or directory
du: cannot access './proc/24896/fd/3': No such file or directory
du: cannot access './proc/24896/fdinfo/3': No such file or directory
0       ./proc
14G     .
cd /root/
du -h --max-depth=1
16K     ./.aptitude
516K    ./.Changelog
4.0K    ./Mail
20K     ./.ssh
9.3G    ./.s3ql
4.0K    ./.nano
9.3G    .
ls -lah
total 4.1G
drwx------ 3 root root 4.0K Aug 19 10:12 .
drwx------ 8 root root 4.0K Aug 19 10:03 ..
-rw------- 1 root root  456 Jul 25  2015 authinfo2
-rw------- 1 root root  218 Jun  2  2015 authinfo2.db1
-rw------- 1 root root  226 Jun  2  2015 authinfo2.greenqloud.db1
-rw------- 1 root root  218 Jun  2  2015 authinfo2.web1
-rw------- 1 root root  218 May 18  2015 authinfo2.web2
-rw------- 1 root root  218 May 19  2015 authinfo2.wiki
-rw-r--r-- 1 root root 8.2M Aug 19 10:00 fsck.log
-rw-r--r-- 1 root root 1.0M Feb 27 00:34 fsck.log.1
-rw-r--r-- 1 root root  46K Feb 17  2016 fsck.log.2
-rw-r--r-- 1 root root 1.0M Feb 17  2016 fsck.log.3
-rw-r--r-- 1 root root 1.0M Dec 18  2015 fsck.log.4
-rw-r--r-- 1 root root 1.0M Oct  1  2015 fsck.log.5
-rw-r--r-- 1 root root 2.5M Aug 19 10:16 mount.log
-rw-r--r-- 1 root root 1.0M Dec 27  2015 mount.log.1
-rw-r--r-- 1 root root    0 Jul 23  2015 mount.s3ql_crit.log
drwxr-xr-x 2 root root  64K Aug 17 00:09 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin1=2F-cache
-rw------- 1 root root 1.1G Aug 19 00:03 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin1=2F.db
-rw-r--r-- 1 root root  200 Aug 17 00:04 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin1=2F.params
-rw------- 1 root root 2.5G Aug 19 09:52 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin2=2F.db
-rw-r--r-- 1 root root  201 Aug 19 09:52 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin2=2F.params
-rw------- 1 root root 517M Aug 19 10:16 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin4=2F.db
-rw-r--r-- 1 root root  200 Aug 19 10:15 s3c:=2F=2Fs.qstack.advania.com:443=2Fcrin4=2F.params
&lt;/pre&gt;&lt;p&gt;
So &lt;tt&gt;/root/.s3ql&lt;/tt&gt; accounts for 9.3G of the 14G in use, but I'm not sure what can be done to free up space there...
&lt;/p&gt;
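&lt;p&gt;
For repeat checks, a sorted summary that stays on the root filesystem is easier to read. This is only a sketch assuming GNU &lt;tt&gt;du&lt;/tt&gt; and &lt;tt&gt;sort&lt;/tt&gt;; the function name &lt;tt&gt;largest_entries&lt;/tt&gt; is made up for illustration:
&lt;/p&gt;
&lt;pre class="wiki"&gt;# Show the N largest files and directories directly under a path.
# -x stops du crossing into other filesystems such as /proc or the
# sshfs and s3ql mounts; sort -rh orders the human-readable sizes.
largest_entries() {
    du -xah --max-depth=1 "$1" | sort -rh | head -n "${2:-10}"
}
&lt;/pre&gt;&lt;p&gt;
It could be called as &lt;tt&gt;largest_entries / 10&lt;/tt&gt; or &lt;tt&gt;largest_entries /root/.s3ql&lt;/tt&gt;; the second argument defaults to 10.
&lt;/p&gt;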
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Fri, 19 Aug 2016 21:14:20 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>https://trac.crin.org/trac/ticket/81#comment:3</link>
      <guid isPermaLink="false">https://trac.crin.org/trac/ticket/81#comment:3</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0&lt;/em&gt; to &lt;em&gt;0.25&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;0.5&lt;/em&gt; to &lt;em&gt;0.75&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
I have stopped the Munin alerts for now by raising the warning threshold from 92% to 94% in &lt;tt&gt;/etc/munin/plugin-conf.d/munin-node&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;[df*]
env.warning 94
env.critical 98
&lt;/pre&gt;&lt;p&gt;
Then restarted &lt;tt&gt;munin-node&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;/etc/init.d/munin-node restart
[ ok ] Restarting munin-node (via systemctl): munin-node.service.
&lt;/pre&gt;&lt;p&gt;
I have also deleted some of the backups of the &lt;tt&gt;Changelog&lt;/tt&gt;:
&lt;/p&gt;
&lt;pre class="wiki"&gt;cd /root/.Changelog
rm -f Changelog.2015*
rm -f Changelog.2016-01*
rm -f Changelog.2016-02*
rm -f Changelog.2016-03*
rm -f Changelog.2016-04*
rm -f Changelog.2016-05*
rm -f Changelog.2016-06*
rm -f Changelog.2016-07*
&lt;/pre&gt;&lt;p&gt;
And cleaned up the apt archive:
&lt;/p&gt;
&lt;pre class="wiki"&gt;apt-get clean
&lt;/pre&gt;&lt;p&gt;
That has bought some time. The next thing to try is deleting old backups, to see whether that frees up space by reducing the amount of metadata stored:
&lt;/p&gt;
&lt;pre class="wiki"&gt;df -h
Filesystem                  Size  Used Avail Use% Mounted on
udev                        489M     0  489M   0% /dev
tmpfs                       100M  5.4M   95M   6% /run
/dev/mapper/CRIN3--vg-root   15G   13G  1.6G  89% /
tmpfs                       499M     0  499M   0% /dev/shm
tmpfs                       5.0M     0  5.0M   0% /run/lock
tmpfs                       499M     0  499M   0% /sys/fs/cgroup
/dev/sda1                   236M  195M   29M  88% /boot
tmpfs                       100M     0  100M   0% /run/user/1000
&lt;/pre&gt;
      </description>
      <category>Ticket</category>
    </item><item>
      
        <dc:creator>chris</dc:creator>

      <pubDate>Thu, 15 Sep 2016 08:47:32 GMT</pubDate>
      <title>hours, totalhours changed</title>
      <link>https://trac.crin.org/trac/ticket/81#comment:4</link>
      <guid isPermaLink="false">https://trac.crin.org/trac/ticket/81#comment:4</guid>
      <description>
          &lt;ul&gt;
            &lt;li&gt;&lt;strong&gt;hours&lt;/strong&gt;
                changed from &lt;em&gt;0&lt;/em&gt; to &lt;em&gt;0.1&lt;/em&gt;
            &lt;/li&gt;
            &lt;li&gt;&lt;strong&gt;totalhours&lt;/strong&gt;
                changed from &lt;em&gt;0.75&lt;/em&gt; to &lt;em&gt;0.85&lt;/em&gt;
            &lt;/li&gt;
          &lt;/ul&gt;
        &lt;p&gt;
This came up again:
&lt;/p&gt;
&lt;pre class="wiki"&gt;crin.org :: crin3.crin.org :: Disk usage in percent
        WARNINGs: / is 97.81 (outside range [:94]).
        OKs: /run is 10.34, /run/lock is 0.00, /boot is 87.07, /sys/fs/cgroup is 0.00, /dev/shm is 0.00.
&lt;/pre&gt;&lt;p&gt;
But it has dropped back down now:
&lt;/p&gt;
&lt;pre class="wiki"&gt;df -h
Filesystem                  Size  Used Avail Use% Mounted on
udev                        489M     0  489M   0% /dev
tmpfs                       100M   11M   90M  11% /run
/dev/mapper/CRIN3--vg-root   15G   12G  2.4G  84% /
tmpfs                       499M     0  499M   0% /dev/shm
tmpfs                       5.0M     0  5.0M   0% /run/lock
tmpfs                       499M     0  499M   0% /sys/fs/cgroup
/dev/sda1                   236M  195M   29M  88% /boot
crin1:/                     121G   27G   88G  24% /media/sshfs/crin1
tmpfs                       100M     0  100M   0% /run/user/1000
&lt;/pre&gt;&lt;p&gt;
I guess it is caused by temporary files or something similar. At some point we will probably have to make this server bigger.
&lt;/p&gt;
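&lt;p&gt;
To find out what fills the disk during these spikes, usage could be sampled from cron and the largest directories logged whenever a threshold is crossed. A rough sketch, assuming GNU &lt;tt&gt;du&lt;/tt&gt; and &lt;tt&gt;sort&lt;/tt&gt;; the function names and the &lt;tt&gt;disk-spike&lt;/tt&gt; syslog tag are made up for illustration:
&lt;/p&gt;
&lt;pre class="wiki"&gt;# Print the Use% figure for a mount point as a bare number.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Send the 20 largest directories on / to syslog once usage
# reaches the given percentage; otherwise do nothing.
log_if_full() {
    if [ "$(usage_pct /)" -ge "$1" ]; then
        du -xh --max-depth=2 / | sort -rh | head -n 20 | logger -t disk-spike
    fi
}
&lt;/pre&gt;&lt;p&gt;
Running &lt;tt&gt;log_if_full 94&lt;/tt&gt; every few minutes would match the current Munin warning level.
&lt;/p&gt;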
      </description>
      <category>Ticket</category>
    </item>
 </channel>
</rss>