Opened 3 years ago
Closed 3 years ago
#35 closed defect (fixed)
Problem with the S3QL backups for Crin1
| Reported by: | chris | Owned by: | chris |
|---|---|---|---|
| Priority: | major | Milestone: | Maintenance |
| Component: | backups | Version: | |
| Keywords: | | Cc: | |
| Estimated Number of Hours: | 0 | Add Hours to Ticket: | 0 |
| Billable?: | yes | Total Hours: | 1.94 |
Description
Just tried mounting all the backups on Crin4 and there was a problem with the Crin1 backup:
umnt-s3ql crin1
/media/s3ql/crin1 is not on an S3QL file system
Can not check mounted file system.

mnt-s3ql crin1
Can not check mounted file system.
Filesystem check of s3c://s.qstack.advania.com:443/crin1 failed
Using 2 upload threads (memory limited).
Autodetected 65498 file descriptors available for cache entries
Using cached metadata.
File system damaged or not unmounted cleanly, run fsck!
Mounting s3c://s.qstack.advania.com:443/crin1 on /media/s3ql/crin1 failed

mnt-s3ql crin1 --force
Can not check mounted file system.
Filesystem check of s3c://s.qstack.advania.com:443/crin1 failed
Using 2 upload threads (memory limited).
Autodetected 65498 file descriptors available for cache entries
Using cached metadata.
File system damaged or not unmounted cleanly, run fsck!
Mounting s3c://s.qstack.advania.com:443/crin1 on /media/s3ql/crin1 failed
Change History (3)
comment:1 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.94
- Total Hours set to 0.94
comment:2 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.25
- Total Hours changed from 0.94 to 1.19
On the list it was suggested to check the ~/.s3ql/mount.log and this is the corresponding entry:
2015-08-13 01:40:51.051 1246:Thread-3 root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
run_old(*args, **kw)
File "/usr/lib/python3.4/threading.py", line 868, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/s3ql/s3ql/block_cache.py", line 404, in _upload_loop
self._do_upload(*tmp)
File "/usr/lib/s3ql/s3ql/block_cache.py", line 431, in _do_upload
% obj_id).get_obj_size()
File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped
return method(*a, **kw)
File "/usr/lib/s3ql/s3ql/backends/common.py", line 271, in perform_write
return fn(fh)
File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 547, in __exit__
self.close()
File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 541, in close
self.fh.close()
File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 706, in close
self.fh.close()
File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped
return method(*a, **kw)
File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 906, in close
headers=self.headers, body=self.fh)
File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 505, in _do_request
self._parse_error_response(resp)
File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 539, in _parse_error_response
raise get_S3Error(tree.findtext('Code'), tree.findtext('Message'), resp.headers)
s3ql.backends.s3c.S3Error: ServiceUnavailable: Please reduce your request rate.
2015-08-13 01:48:39.010 1246:MainThread s3ql.mount.unmount: Unmounting file system...
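The ServiceUnavailable error above means the backend was throttling requests. S3QL has its own retry handling, but for any custom tooling that talks to a throttling backend the general remedy is to retry with exponential backoff. This is a hedged, self-contained sketch of that technique, not s3ql code; the `ServiceUnavailable` class and `fn` callable here are stand-ins for illustration:

```python
import random
import time

class ServiceUnavailable(Exception):
    """Stand-in for a backend throttling error such as S3's ServiceUnavailable."""

def retry_with_backoff(fn, max_attempts=5, base_delay=1.0):
    """Call fn(), retrying on ServiceUnavailable with exponential backoff.

    The delay doubles on each attempt (1x, 2x, 4x, ... the base delay),
    with a little random jitter added so many clients do not retry in
    lockstep against an already-overloaded server.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the error to the caller.
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

The jitter and doubling delay are the standard response to a "please reduce your request rate" style error: each retry gives the server progressively more breathing room.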
comment:3 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.75
- Resolution set to fixed
- Status changed from new to closed
- Total Hours changed from 1.19 to 1.94
Unmounting via fuse:
fusermount -u /tmp/s3ql_backup_1236
Filesystem check:
fsck.s3ql s3c://s.qstack.advania.com:443/crin1
Starting fsck of s3c://s.qstack.advania.com:443/crin1/
Using cached metadata.
Remote metadata is outdated.
Checking DB integrity...
Creating temporary extra indices...
Checking lost+found...
Checking cached objects...
Checking names (refcounts)...
Checking contents (names)...
Checking contents (inodes)...
Checking contents (parent inodes)...
Checking objects (reference counts)...
Checking objects (backend)...
..processed 64000 objects so far..
Checking objects (sizes)...
Checking blocks (referenced objects)...
Checking blocks (refcounts)...
Checking blocks (checksums)...
Checking inode-block mapping (blocks)...
Checking inode-block mapping (inodes)...
Checking inodes (refcounts)...
Checking inodes (sizes)...
Checking extended attributes (names)...
Checking extended attributes (inodes)...
Checking symlinks (inodes)...
Checking directory reachability...
Checking unix conventions...
Checking referential integrity...
Dropping temporary indices...
Dumping metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Compressing and uploading metadata...
Wrote 10.5 MiB of compressed metadata.
Cycling metadata backups...
Backing up old metadata...
Unexpected server reply: expected XML, got:
200 OK
x-amz-meta-006: 'data': b'XXXX
x-amz-meta-007: XXXX',
Content-Length: 0
x-amz-meta-005: 'signature': b'XXXX',
x-amz-meta-002: 'format_version': 2,
x-amz-meta-003: 'nonce': b'XXXX',
x-amz-meta-000: 'object_id': 's3ql_metadata_bak_10',
x-amz-meta-001: 'compression': 'None',
x-amz-id-2: XXXX
x-amz-meta-md5: XXXX
x-amz-meta-004: 'encryption': 'AES_v2',
Last-Modified: Wed, 09 Sep 2015 10:57:10 GMT
ETag: "XXX"
x-amz-request-id: XXX
x-amz-meta-format: raw2
Content-Type: text/html; charset="UTF-8"
X-Trans-Id: XXX
Date: Wed, 09 Sep 2015 10:57:10 +0000
Connection: keep-alive
Uncaught top-level exception:
Traceback (most recent call last):
File "/usr/bin/fsck.s3ql", line 9, in <module>
load_entry_point('s3ql==2.13', 'console_scripts', 'fsck.s3ql')()
File "/usr/lib/s3ql/s3ql/fsck.py", line 1309, in main
cycle_metadata(backend)
File "/usr/lib/s3ql/s3ql/metadata.py", line 121, in cycle_metadata
cycle_fn("s3ql_metadata_bak_%d" % i, "s3ql_metadata_bak_%d" % (i + 1))
File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 291, in copy
self._copy_or_rename(src, dest, rename=False, metadata=metadata)
File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 325, in _copy_or_rename
self.backend.copy(src, dest, metadata=meta_raw)
File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped
return method(*a, **kw)
File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 404, in copy
root = self._parse_xml_response(resp, body)
File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 554, in _parse_xml_response
raise RuntimeError('Unexpected server response')
RuntimeError: Unexpected server response
Running it with --backend-options="dumb-copy" (which makes S3QL treat a 200 status on a COPY request as success without parsing the response body), however, worked:
fsck.s3ql --backend-options="dumb-copy" s3c://s.qstack.advania.com:443/crin1
Starting fsck of s3c://s.qstack.advania.com:443/crin1/
Using cached metadata.
Remote metadata is outdated.
Checking DB integrity...
Creating temporary extra indices...
Checking lost+found...
Checking cached objects...
Checking names (refcounts)...
Checking contents (names)...
Checking contents (inodes)...
Checking contents (parent inodes)...
Checking objects (reference counts)...
Checking objects (backend)...
..processed 65000 objects so far..
Checking objects (sizes)...
Checking blocks (referenced objects)...
Checking blocks (refcounts)...
Checking blocks (checksums)...
Checking inode-block mapping (blocks)...
Checking inode-block mapping (inodes)...
Checking inodes (refcounts)...
Checking inodes (sizes)...
Checking extended attributes (names)...
Checking extended attributes (inodes)...
Checking symlinks (inodes)...
Checking directory reachability...
Checking unix conventions...
Checking referential integrity...
Dropping temporary indices...
Dumping metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Compressing and uploading metadata...
Wrote 10.5 MiB of compressed metadata.
Cycling metadata backups...
Backing up old metadata...
Cleaning up local metadata...
Completed fsck of s3c://s.qstack.advania.com:443/crin1/
So --backend-options="dumb-copy" has been added to the scripts and the backup job is now running in screen. I have created a follow-up ticket, ticket:39, to address this issue more generally, and have opened an S3QL ticket as advised (https://bitbucket.org/nikratio/s3ql/issues/161/serviceunavailable-please-reduce-your), so I am closing this ticket.
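Applying the workaround amounts to adding the extra option wherever fsck.s3ql (and mount.s3ql) is invoked in the backup scripts. The snippet below is a hypothetical sketch of that wrapper logic, not the actual scripts from this ticket; only the storage URL and the option itself are taken from the ticket:

```python
# Hypothetical wrapper logic: build the fsck.s3ql invocation with the
# dumb-copy backend option applied, so the backend's non-XML COPY
# responses no longer abort the metadata backup cycling step.
STORAGE_URL = "s3c://s.qstack.advania.com:443/crin1"

def build_fsck_cmd(storage_url):
    """Return the argv list for fsck.s3ql with the dumb-copy workaround."""
    return ["fsck.s3ql", "--backend-options=dumb-copy", storage_url]

# A real script would execute this, e.g.:
#   subprocess.run(build_fsck_cmd(STORAGE_URL), check=True)
```

Building the command as an argv list (rather than a shell string) avoids quoting problems if the storage URL ever contains shell metacharacters.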

Looking at the email logs, this backup last ran without an error on Thu, 13 Aug 2015 01:48:39 +0000. I am very sorry not to have noticed the problem sooner. This is the log entry from when the problem started:
sent 117 bytes  received 12 bytes  258.00 bytes/sec
total size is 3,472  speedup is 26.91
Uncaught top-level exception:
Traceback (most recent call last):
File "/usr/bin/umount.s3ql", line 9, in <module>
load_entry_point('s3ql==2.13', 'console_scripts', 'umount.s3ql')()
File "/usr/lib/s3ql/s3ql/umount.py", line 184, in main
blocking_umount(options.mountpoint)
File "/usr/lib/s3ql/s3ql/umount.py", line 123, in blocking_umount
pid = pickle.loads(llfuse.getxattr(ctrlfile, 's3ql_pid?'))
File "fuse_api.pxi", line 206, in llfuse.capi.getxattr (src/llfuse/capi_linux.c:21046)
ConnectionAbortedError: [Errno 103] Software caused connection abort: '/tmp/s3ql_backup_1236/.__s3ql__ctrl__'

I'm not sure what to do, so I have emailed the s3ql list about this.
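The ConnectionAbortedError here comes from a dead FUSE connection: the filesystem is still listed in /proc/mounts, but the mount.s3ql process behind it no longer answers, so umount.s3ql cannot read the control file. A wrapper script can at least detect whether a path is still registered as a mount point before deciding how to clean up. This is a small hedged helper for illustration (Linux-specific, not part of s3ql):

```python
import os

def is_mounted(path):
    """Return True if path appears as a mount point in /proc/mounts (Linux).

    Note this only checks the mount table: a hung or crashed FUSE
    filesystem can still be listed here even though it no longer
    responds, which is exactly the situation in this ticket.
    """
    real = os.path.realpath(path)
    with open("/proc/mounts") as fh:
        # The second whitespace-separated field of each line is the mount point.
        return any(line.split()[1] == real for line in fh)
```

When the mount is listed but unresponsive, a lazy/forced unmount such as `fusermount -u <mountpoint>` (used in comment 3) is the usual way to clear it.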