Opened 3 years ago

Closed 3 years ago

#34 closed defect (fixed)

Memcache Munin Stats

Reported by: chris
Owned by: chris
Priority: major
Milestone: Maintenance
Component: munin
Version:
Keywords:
Cc: peter, mori
Estimated Number of Hours: 0
Add Hours to Ticket: 0
Billable?: yes
Total Hours: 1.55

Description (last modified by chris)

Munin stats for memcache stopped at the end of August 2015 and I'm not sure why.

Attachments (6)

Change History (16)

comment:1 Changed 3 years ago by chris

  • Description modified (diff)

comment:2 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.5
  • Total Hours set to 0.5

The plugin was configured on ticket:10#comment:8.

The Munin node was restarted in case that helps:

/etc/init.d/munin-node status
● munin-node.service - Munin Node
   Loaded: loaded (/lib/systemd/system/munin-node.service; enabled)
   Active: active (running) since Tue 2015-06-23 20:29:12 GMT; 2 months 14 days ago
     Docs: man:munin-node(1)
 Main PID: 1343 (munin-node)
   CGroup: /system.slice/munin-node.service
           └─1343 /usr/bin/perl -wT /usr/sbin/munin-node

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

/etc/init.d/munin-node restart
[ ok ] Restarting munin-node (via systemctl): munin-node.service.

Perhaps the problem is related to the lack of a log file; in /etc/memcached.conf we have:

logfile /var/log/memcached.log

But this file doesn't exist.

Memcached seems to be running OK; I restarted it the other day when I noticed this issue, hence the low uptime:

 service memcached status
● memcached.service - memcached daemon
   Loaded: loaded (/lib/systemd/system/memcached.service; enabled)
   Active: active (running) since Sat 2015-09-05 15:45:45 GMT; 1 day 18h ago
 Main PID: 23927 (memcached)
   CGroup: /system.slice/memcached.service
           └─23927 /usr/bin/memcached -m 512 -p 11211 -u memcache -l

However memory usage seems to have dropped to 2.47M from 400M after that restart:

Running the Munin plugin on the command line:

cd /etc/munin/plugins/
munin-run memcached_bytes 
  memcache_bytes_read.value 10710
  memcache_bytes_written.value 1697558
munin-run memcached_counters 
  memcache_bytes_allocated.value 0
  memcache_curr_connections.value 5
  memcache_curr_items.value 0
munin-run memcached_rates 
  memcache_cache_hits.value 0
  memcache_cache_misses.value 0
  memcache_cmd_get.value 0
  memcache_cmd_set.value 0
  memcache_total_connections.value 1537
  memcache_total_items.value 0

All three plugins are actually symlinks to /usr/share/munin/plugins/memcached_, which is a Perl script.
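To rule out the plugin itself, the same counters can be read straight from memcached with its text-protocol "stats" command. The following is only a minimal Python sketch, not what the Perl plugin does internally: the function names are hypothetical, and the host 127.0.0.1 is an assumption because the listen address is cut off in the status output above (the port 11211 is taken from it).

```python
import socket


def parse_stats(raw: str) -> dict:
    """Parse the reply to memcached's "stats" command into a name -> value dict."""
    stats = {}
    for line in raw.splitlines():
        # Each data line looks like "STAT <name> <value>"; the reply ends with "END".
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


def fetch_stats(host: str = "127.0.0.1", port: int = 11211, timeout: float = 2.0) -> dict:
    """Send "stats" to memcached and return the parsed reply.

    The default host is an assumption; adjust it to the address memcached listens on.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        # Read until the terminating "END" line or until the server closes the socket.
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", errors="replace"))
```

Calling fetch_stats() on the server and looking at keys such as get_hits, get_misses and curr_items should show the same picture as the munin-run output: if they are all zero while curr_connections is not, memcached is up but nothing is storing data in it.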

comment:3 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.1
  • Total Hours changed from 0.5 to 0.6

The php-fpm server was last restarted 11 days ago, with the last update, as recorded on ticket:17#comment:22:

service php5-fpm status
● php5-fpm.service - The PHP FastCGI Process Manager
   Loaded: loaded (/lib/systemd/system/php5-fpm.service; enabled)
   Active: active (running) since Thu 2015-08-27 15:36:46 GMT; 1 weeks 3 days ago
 Main PID: 21709 (php5-fpm)
   Status: "Processes active: 1, idle: 17, Requests: 922589, slow: 0, Traffic: 0.3req/sec"
   CGroup: /system.slice/php5-fpm.service
           ├─ 9774 php-fpm: pool www
           ├─17518 php-fpm: pool www
           ├─20968 php-fpm: pool www
           ├─20969 php-fpm: pool www
           ├─20970 php-fpm: pool www
           ├─20971 php-fpm: pool www
           ├─21709 php-fpm: master process (/etc/php5/fpm/php-fpm.conf)
           ├─23244 php-fpm: pool www
           ├─23724 php-fpm: pool www
           ├─24298 php-fpm: pool www
           ├─24299 php-fpm: pool www
           ├─24300 php-fpm: pool www
           ├─24301 php-fpm: pool www
           ├─24302 php-fpm: pool www
           ├─24303 php-fpm: pool www
           ├─24304 php-fpm: pool www
           ├─24305 php-fpm: pool www
           ├─24306 php-fpm: pool www
           └─29397 php-fpm: pool www

Trying a php5-fpm and memcached restart...

comment:4 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.15
  • Total Hours changed from 0.6 to 0.75

There was a massive PHP load spike at the time that memcache stopped working:

Since then there has been a noticeable increase in the number of busy php-fpm processes:

comment:5 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.25
  • Cc peter added
  • Total Hours changed from 0.75 to 1.0

There were also a couple of ssh sessions opened by peter at the time:

Peter, do you have any idea why Memcache stopped being used after 11pm on 31st August? You had a couple of ssh sessions open at the time; did you make some changes to the site?

comment:6 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.25
  • Total Hours changed from 1.0 to 1.25

Peter has reported that it was probably a change in settings.php that caused this, so here is a diff (omitting passwords etc), generated on Crin3:

diff /media/s3ql/crin2/2015-08-30_03\:01\:11/var/www/prod/docroot/sites/default/settings.php /media/s3ql/crin2/2015-09-01_03\:01\:13/var/www/prod/docroot/sites/default/settings.php

<  *     'prefix' => '',
>  *     'prefix' => '',git
< /* old settings
<  * $databases = array (
<  *   'default' => 
<  *   array (
<  *     'default' => 
<  *     array (
<  *       'database' => 'crin_db',
<  *       'username' => 'crin_dbu',
<  *       'password' => 'XXXX',
<  *       'host' => '',
<  *       'port' => '',
<  *       'driver' => 'mysql',
<  *       'prefix' => '',
<  *     ),
<  *   ),
<  * );
<  */
< $databases = array (
<   'default' =>
<   array (
<     'default' =>
<     array (
<       'database' => 'drupal',
<       'username' => 'drupal',
<       'password' => 'XXXX',
<       'host' => 'crin1',
<       'port' => '',
<       'driver' => 'mysql',
<       'prefix' => '',
<       'pdo' => array(
<            PDO::MYSQL_ATTR_SSL_KEY => '/etc/ssl/cacert/crin1_yassl_privatekey.pem',
<            PDO::MYSQL_ATTR_SSL_CERT => '/etc/ssl/cacert/crin1_cert.pem',
<            PDO::MYSQL_ATTR_SSL_CA => '/etc/ssl/cacert/cacert.pem',
<         ),
<     ),
<   ),
< );
>  //details of newprod
>  $databases = array(
>    'default' =>
>      array(
>        'default' =>
>          array(
>            'database' => 'newprod',
>            'username' => 'newprod',
>            'password' => 'XXXX',
>            'host' => 'crin1',
>            'port' => '',
>            'driver' => 'mysql',
>            'prefix' => '',
>            'pdo' => array(
>              PDO::MYSQL_ATTR_SSL_KEY => '/etc/ssl/cacert/crin1_yassl_privatekey.pem',
>              PDO::MYSQL_ATTR_SSL_CERT => '/etc/ssl/cacert/crin1_cert.pem',
>              PDO::MYSQL_ATTR_SSL_CA => '/etc/ssl/cacert/cacert.pem',
>        ),
>      ),
>    ),
>  );
> //Load local settings file
> if (file_exists(DRUPAL_ROOT . '/' . conf_path() . '/')) {
>   include DRUPAL_ROOT . '/' . conf_path() . '/';
> }
< /* $drupal_hash_salt = 'XXXXXX';
<    updated by chris on 2014-04-09
< */
< $drupal_hash_salt = 'XXXXXX';
> $drupal_hash_salt = 'XXXXXX';
< # $conf['allow_authorize_operations'] = FALSE
< #include './sites/all/modules/domain/';
< /**
<  * Add the domain module setup routine.
<  */
< include DRUPAL_ROOT . '/sites/all/modules/domain/';
< /** The following lines were commented out as memcache wasn't properly configured
<  *  2015-03-16
<  *
<  * $conf['cache_backends'][] = 'sites/all/modules/memcache/';
<  *  // The 'cache_form' bin must be assigned no non-volatile storage.
<  * $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
<  * $conf['cache_default_class'] = 'MemCacheDrupal';
<  * $conf['memcache_key_prefix'] = 'crin_';
<  */
> # $conf['allow_authorize_operations'] = FALSE;
< include_once DRUPAL_ROOT . '/includes/';
< include_once DRUPAL_ROOT . '/sites/all/modules/memcache/';
< $conf['cache_default_class'] = 'MemCacheDrupal';
> // Please don't edit anything between <DDSETTINGS> tags.
> // This section is autogenerated by Acquia Dev Desktop.
> }
> // Include Domain Access
> include DRUPAL_ROOT . '/sites/all/modules/contrib/domain/';

comment:7 Changed 3 years ago by mori

Hi Chris,

Thanks. Sorry, we probably didn't inform you, but we launched the CRIN website under the new codebase on 31/8.

The site that is now in production does not have the config in place to utilise memcache so I will add it back in. The previous production site had the following:

/** The following lines were commented out as memcache wasn't properly configured
 *  2015-03-16
 * $conf['cache_backends'][] = 'sites/all/modules/memcache/';
 *  // The 'cache_form' bin must be assigned no non-volatile storage.
 * $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
 * $conf['cache_default_class'] = 'MemCacheDrupal';
 * $conf['memcache_key_prefix'] = 'crin_';

The new config I'd propose is as follows. If you don't see any issues I'll go ahead and deploy the change.

include_once DRUPAL_ROOT . '/includes/';
include_once DRUPAL_ROOT . '/sites/all/modules/memcache/';
$conf['cache_default_class'] = 'MemCacheDrupal';
// The 'cache_form' bin must be assigned to non-volatile storage.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';

comment:8 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.15
  • Cc mori added
  • Total Hours changed from 1.25 to 1.4

Fine to go ahead with the new config; I see we now have memcache graphs for Crin4 showing that it is working on the dev/staging server :-)

comment:9 follow-up: Changed 3 years ago by mori

Thanks for your confirmation Chris. The change has been deployed on Prod. Can you please confirm if memcache is working for Prod again?

comment:10 in reply to: ↑ 9 Changed 3 years ago by chris

  • Add Hours to Ticket changed from 0 to 0.15
  • Resolution set to fixed
  • Status changed from new to closed
  • Total Hours changed from 1.4 to 1.55

Replying to mori:

Can you please confirm if memcache is working for Prod again?

Yes, Memcache is being used again:

cd /etc/munin/plugins/
munin-run memcached_bytes 
  memcache_bytes_read.value 150786679
  memcache_bytes_written.value 340616813
munin-run memcached_rates 
  memcache_cache_hits.value 157034
  memcache_cache_misses.value 13026
  memcache_cmd_get.value 170060
  memcache_cmd_set.value 29120
  memcache_total_connections.value 1455
  memcache_total_items.value 29120

Stats have also started to appear in the Munin graphs, so I'm closing this ticket. Thanks for fixing it!
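For reference, the counters above can be turned into a cache hit ratio, which makes the difference from the earlier all-zero output easier to see. This is just a small Python sketch using the values quoted in this ticket; the function name is hypothetical.

```python
def hit_ratio(hits: int, misses: int):
    """Return hits / (hits + misses), or None when there were no lookups at all."""
    total = hits + misses
    if total == 0:
        return None  # no get requests yet, so a ratio is undefined
    return hits / total


# Values from the munin-run memcached_rates output after the fix.
after_fix = hit_ratio(157034, 13026)
# Values from comment:2, while the new codebase was not using memcache.
before_fix = hit_ratio(0, 0)

print(after_fix)   # roughly 0.92, i.e. about 92% of gets are served from the cache
print(before_fix)  # None
```

Note that hits plus misses (157034 + 13026) equals the reported memcache_cmd_get value of 170060, so the counters are consistent with each other.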
