Opened 3 years ago
Closed 3 years ago
#35 closed defect (fixed)
Problem with the S3QL backups for Crin1
Reported by: | chris | Owned by: | chris |
---|---|---|---|
Priority: | major | Milestone: | Maintenance |
Component: | backups | Version: | |
Keywords: | Cc: | ||
Estimated Number of Hours: | 0 | Add Hours to Ticket: | 0 |
Billable?: | yes | Total Hours: | 1.94 |
Description
Just tried mounting all the backup on Crin4 and there was a problem with the Crin1 backup:
umnt-s3ql crin1 /media/s3ql/crin1 is not on an S3QL file system Can not check mounted file system. mnt-s3ql crin1 Can not check mounted file system. Filesystem check of s3c://s.qstack.advania.com:443/crin1 failed Using 2 upload threads (memory limited). Autodetected 65498 file descriptors available for cache entries Using cached metadata. File system damaged or not unmounted cleanly, run fsck! Mounting s3c://s.qstack.advania.com:443/crin1 on /media/s3ql/crin1 failed mnt-s3ql crin1 --force Can not check mounted file system. Filesystem check of s3c://s.qstack.advania.com:443/crin1 failed Using 2 upload threads (memory limited). Autodetected 65498 file descriptors available for cache entries Using cached metadata. File system damaged or not unmounted cleanly, run fsck! Mounting s3c://s.qstack.advania.com:443/crin1 on /media/s3ql/crin1 failed
Change History (3)
comment:1 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.94
- Total Hours set to 0.94
comment:2 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.25
- Total Hours changed from 0.94 to 1.19
On the list it was suggested to check the ~/.s3ql/mount.log and this is the corresponding entry:
2015-08-13 01:40:51.051 1246:Thread-3 root.excepthook: Uncaught top-level exception: Traceback (most recent call last): File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook run_old(*args, **kw) File "/usr/lib/python3.4/threading.py", line 868, in run self._target(*self._args, **self._kwargs) File "/usr/lib/s3ql/s3ql/block_cache.py", line 404, in _upload_loop self._do_upload(*tmp) File "/usr/lib/s3ql/s3ql/block_cache.py", line 431, in _do_upload % obj_id).get_obj_size() File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped return method(*a, **kw) File "/usr/lib/s3ql/s3ql/backends/common.py", line 271, in perform_write return fn(fh) File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 547, in __exit__ self.close() File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 541, in close self.fh.close() File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 706, in close self.fh.close() File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped return method(*a, **kw) File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 906, in close headers=self.headers, body=self.fh) File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 505, in _do_request self._parse_error_response(resp) File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 539, in _parse_error_response raise get_S3Error(tree.findtext('Code'), tree.findtext('Message'), resp.headers) s3ql.backends.s3c.S3Error: ServiceUnavailable: Please reduce your request rate. 2015-08-13 01:48:39.010 1246:MainThread s3ql.mount.unmount: Unmounting file system...
comment:3 Changed 3 years ago by chris
- Add Hours to Ticket changed from 0 to 0.75
- Resolution set to fixed
- Status changed from new to closed
- Total Hours changed from 1.19 to 1.94
Unmounting via fuse:
fusermount -u /tmp/s3ql_backup_1236
Filesystem check:
fsck.s3ql s3c://s.qstack.advania.com:443/crin1 Starting fsck of s3c://s.qstack.advania.com:443/crin1/ Using cached metadata. Remote metadata is outdated. Checking DB integrity... Creating temporary extra indices... Checking lost+found... Checking cached objects... Checking names (refcounts)... Checking contents (names)... Checking contents (inodes)... Checking contents (parent inodes)... Checking objects (reference counts)... Checking objects (backend)... ..processed 64000 objects so far.. Checking objects (sizes)... Checking blocks (referenced objects)... Checking blocks (refcounts)... Checking blocks (checksums)... Checking inode-block mapping (blocks)... Checking inode-block mapping (inodes)... Checking inodes (refcounts)... Checking inodes (sizes)... Checking extended attributes (names)... Checking extended attributes (inodes)... Checking symlinks (inodes)... Checking directory reachability... Checking unix conventions... Checking referential integrity... Dropping temporary indices... Dumping metadata... ..objects.. ..blocks.. ..inodes.. ..inode_blocks.. ..symlink_targets.. ..names.. ..contents.. ..ext_attributes.. Compressing and uploading metadata... Wrote 10.5 MiB of compressed metadata. Cycling metadata backups... Backing up old metadata... Unexpected server reply: expected XML, got: 200 OK x-amz-meta-006: 'data': b'XXXX x-amz-meta-007: XXXX', Content-Length: 0 x-amz-meta-005: 'signature': b'XXXX', x-amz-meta-002: 'format_version': 2, x-amz-meta-003: 'nonce': b'XXXX', x-amz-meta-000: 'object_id': 's3ql_metadata_bak_10', x-amz-meta-001: 'compression': 'None', x-amz-id-2: XXXX x-amz-meta-md5: XXXX x-amz-meta-004: 'encryption': 'AES_v2', Last-Modified: Wed, 09 Sep 2015 10:57:10 GMT ETag: "XXX" x-amz-request-id: XXX x-amz-meta-format: raw2 Content-Type: text/html; charset="UTF-8" X-Trans-Id: XXX Date: Wed, 09 Sep 2015 10:57:10 +0000 Connection: keep-alive Uncaught top-level exception: Traceback (most recent call last): File "/usr/bin/fsck.s3ql", line 9, in <module> load_entry_point('s3ql==2.13', 'console_scripts', 'fsck.s3ql')() File "/usr/lib/s3ql/s3ql/fsck.py", line 1309, in main cycle_metadata(backend) File "/usr/lib/s3ql/s3ql/metadata.py", line 121, in cycle_metadata cycle_fn("s3ql_metadata_bak_%d" % i, "s3ql_metadata_bak_%d" % (i + 1)) File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 291, in copy self._copy_or_rename(src, dest, rename=False, metadata=metadata) File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 325, in _copy_or_rename self.backend.copy(src, dest, metadata=meta_raw) File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped return method(*a, **kw) File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 404, in copy root = self._parse_xml_response(resp, body) File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 554, in _parse_xml_response raise RuntimeError('Unexpected server response') RuntimeError: Unexpected server response
Running it with the --backend-options="dumb-copy" however worked:
fsck.s3ql --backend-options="dumb-copy" s3c://s.qstack.advania.com:443/crin1 Starting fsck of s3c://s.qstack.advania.com:443/crin1/ Using cached metadata. Remote metadata is outdated. Checking DB integrity... Creating temporary extra indices... Checking lost+found... Checking cached objects... Checking names (refcounts)... Checking contents (names)... Checking contents (inodes)... Checking contents (parent inodes)... Checking objects (reference counts)... Checking objects (backend)... ..processed 65000 objects so far.. Checking objects (sizes)... Checking blocks (referenced objects)... Checking blocks (refcounts)... Checking blocks (checksums)... Checking inode-block mapping (blocks)... Checking inode-block mapping (inodes)... Checking inodes (refcounts)... Checking inodes (sizes)... Checking extended attributes (names)... Checking extended attributes (inodes)... Checking symlinks (inodes)... Checking directory reachability... Checking unix conventions... Checking referential integrity... Dropping temporary indices... Dumping metadata... ..objects.. ..blocks.. ..inodes.. ..inode_blocks.. ..symlink_targets.. ..names.. ..contents.. ..ext_attributes.. Compressing and uploading metadata... Wrote 10.5 MiB of compressed metadata. Cycling metadata backups... Backing up old metadata... Cleaning up local metadata... Completed fsck of s3c://s.qstack.advania.com:443/crin1/
So --backend-options="dumb-copy" has been added to the scripts and the backup job is running in screen and I have created a follow up ticket to address this issue generally ticket:39, have opened a S3QL ticket as advised, https://bitbucket.org/nikratio/s3ql/issues/161/serviceunavailable-please-reduce-your so closing this ticket.
Try to verify the filesystem:
Try that:
But it is not mounted:
However it was, so:
Looking at the email logs, this backup last run without an error on Thu, 13 Aug 2015 01:48:39 +0000, very sorry not to have noticed the problem sooner, this is the log entry from when the problem started:
I'm not sure what to so I have emailed the s3ql list about this.