29 July 2010 - 12:00 v0.21 released

It’s been a while, but v0.21 is ready.  Most of the work this time around has been on stability. There is one key new feature, however: RBD, the rados block device, which lets you create a virtual disk backed by objects stored in the Ceph cluster.  The images can be mapped natively by the ceph kernel module or via a driver in qemu/KVM.  Although neither of those drivers is upstream yet, the server-side functionality and admin tools are in place.

Changes since v0.20 include:

  • improved logging infrastructure
  • log rotate
  • mkfs improvements
  • rbd tool, and rados class
  • mds: return ENOTEMPTY when removing directory with snapshots
  • mds: lazy io support (experimental)
  • msgr: send messages directly to connection handles (more efficient)
  • faster atomic_t via libatomic-ops
  • mon: recovery improvements, fixes (e.g. when one mon is down for a long time)
  • mon: warn on monitor clock drift
  • osd: large object support
  • osd: heartbeat improvements, fixes
  • osd: journaling fixes, improvements (bugs, better use of direct io)
  • osd: snapshot rollback op (for rbd)
  • radosgw fixes, improvements
  • many memory leaks and other bugs fixed

The project roadmap has been updated and is available via the issue tracker.

posted by sage | 2 Comments | Tags: Releases

27 May 2010 - 10:04 v0.20.2 released

We’ve released v0.20.2 with a few bug fixes.  These include

  • initscript: drop incorrect default btrfs mount option
  • initscript: behave on ksh (ubuntu)
  • monc: monitor hunting fixes
  • osd: mkfs more robust
  • cfuse: fix mount error handling
  • ppc64: fix build problems on fedora
  • mds: misc clustering fixes
  • osd: fix recovery bug

posted by sage | No Comments | Tags: Releases

17 May 2010 - 9:57 Linux v2.6.34 is out

Linux v2.6.34, which includes the Ceph kernel client, has been released!  This is an exciting milestone for us, and we’re pretty happy with the stability of the client code that made it into this release.  This should make it easier for people to experiment with Ceph and see how it holds up on a wide variety of systems.

Please note, however, that Ceph is still experimental and is not yet ready for use in a production environment.  We have made every effort to prevent the client from crashing your system, but it is still relatively young code.  The server side also has some known issues, and will need both time and testing to earn our trust.

posted by sage | 4 Comments | Tags: Releases

17 May 2010 - 9:40 v0.20.1 released

We’ve released a stable update with a bunch of bug fixes since v0.20.  Notably, we’ve fixed

  • mkfs problems with osd journal file
  • librados aio api issues
  • misc osd fixes (crashes, hangs)
  • inconsistent readdir results across nodes

and lots of other small stuff.

posted by sage | 2 Comments | Tags: Releases

30 April 2010 - 15:32 v0.20 released

After a long few weeks of debugging, we’re releasing v0.20.  The goal here is to get something out prior to the v2.6.34 kernel release (which includes the Ceph client) with most of the pending improvements.  Changes since v0.19 include:

  • osd: new filestore, journaling infrastructure.  (lower latency writes, btrfs no longer strictly required)
  • msgr: wire protocol improvements
  • mds: reduced memory utilization (still more to do!)
  • auth: many auth_x cleanups and improvements
  • librados: some cleanup; C++ API now usable
  • many bug fixes throughout

There are a handful of bugs that we’ve seen but haven’t been able to reproduce reliably.  As those are fixed there will be a v0.20.1 point release.  In the meantime, work continues on v0.21.  Upcoming changes include:

  • performance improvements
  • rbd: rados block device (kvm and native linux drivers)
  • flock/fcntl lock support
  • lazy io
  • allow client reconnect even after mds has restarted (useful for clients temporarily disconnected during mds restarts)
  • cluster mds fixes

To get it:

RPMs will be included in the soon-to-be-released Fedora 13.  There is also a ceph.spec file in git to build your own.

posted by sage | 4 Comments | Tags: Releases

17 February 2010 - 14:30 v0.19 released

The v0.19 release is finally here.  The focus this past cycle was on stability and the disk format, and things have improved greatly in that area.  Our plan is to make any future disk format changes roll forward, so that users won’t need to rebuild their file systems.  The protocol has also grown feature bits so that it is at least possible to make protocol changes transparent; whether we do so or not will depend on the severity of the change and the cost of maintaining compatibility.
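To illustrate how feature bits can make a protocol change transparent, here is a minimal sketch. The flag names, values, and the negotiate() helper are hypothetical and are not taken from the Ceph source; the sketch only shows the general idea that each side advertises a bitmask and the connection uses the intersection.

```python
# Hypothetical feature flags; names and values are illustrative only,
# not the ones used by Ceph.
FEATURE_BASE = 1 << 0
FEATURE_NEW_ENCODING = 1 << 1
FEATURE_COMPRESSION = 1 << 2


def negotiate(supported, required, peer_supported):
    """Return the feature bits both ends can use.

    Raises ValueError if the peer lacks a bit that we require.
    """
    missing = required & ~peer_supported
    if missing:
        raise ValueError(f"peer is missing required feature bits: {missing:#x}")
    return supported & peer_supported


# A newer server talking to an older client that lacks FEATURE_COMPRESSION.
server = FEATURE_BASE | FEATURE_NEW_ENCODING | FEATURE_COMPRESSION
old_client = FEATURE_BASE | FEATURE_NEW_ENCODING
common = negotiate(server, FEATURE_BASE, old_client)
```

In this sketch an optional feature is simply left out of the common set when an older peer does not advertise it, while a severe change can be expressed by adding its bit to the required mask, which makes the connection fail cleanly instead of misbehaving.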

Overall, things are looking good.  If you’ve been standing on the sidelines waiting for something more stable to test, now is a good time to try things out.  There are some lingering OSD performance problems (see below), and we are still a long ways off from something we would recommend for use in a production environment, but otherwise this release is looking pretty good for evaluation purposes.

Changes since v0.18 include:

  • Stabilized disk format, with feature bits
  • Wire protocol feature bits
  • structure encoding versioning throughout
  • msgr: code simplification, cleanup, bug fixes
  • truncation fixes
  • debian: packaging improvements
  • rados: pool deletion, misc fixes
  • osd: recovery fixes, journaling fixes
  • lots of bug fixes (osd, mds, client)

On the kernel client side of things,

  • Support for Kerberos-like ‘cephx’ authentication
  • sync/directio read/write bug fixes (multiple client access to a single file)
  • writeback congestion control
  • mds ops interruptible (with control-c)
  • Lots of code cleanup
  • Lots of bug fixes

Notably, there are major revisions underway with the way the storage daemon cosd interacts with btrfs, and these are sufficiently intrusive and untested that they did not make it into this release.  They should be in v0.20.  That means that OSD performance is still not great in v0.19.  (So far performance with the new code is much, much better.)

The primary focus areas for v0.20 will be

  • OSD performance and btrfs interface changes
  • Clustered MDS

Enjoy!

posted by sage | No Comments | Tags: Releases

4 December 2009 - 10:52 v0.18 released

There’s a v0.18 release to match the latest posting of the kernel client code on the Linux email lists.  If there are no final issues there, that will be what I send to Linus for 2.6.33.

Most of the changes since v0.17 are bug fixes in the MDS and kclient.  The main other item is an authentication framework to restrict access to the cluster and its services to authorized clients.  Two protocols/schemes are implemented: an AUTH_NONE framework that does no real authentication (and is essentially equivalent to what we’ve had until now) and an AUTH_CEPHX scheme that uses Kerberos-like tickets to mutually authenticate clients and services.

Changes since v0.17 include:

  • osd: basic ENOSPC handling
  • big endian fixes
  • osd: improved object -> pg hash function; selectable
  • crush: selectable hash functions
  • mds restart bug fixes
  • kclient: mds reconnect bug fixes
  • fixed mds log trimming bug
  • fixed mds cap vs snap deadlock
  • filestore: faster flushing
  • uclient,kclient: snapshot fixes
  • mds: fixed recursive accounting bug
  • uclient: fixes for 32bit clients
  • auth: ‘none’ security framework
  • mon: safely bail on write errors (e.g. ENOSPC)
  • mds: fix replay/reconnect race (causing a fast client reconnect to fail)
  • mds: misc journal replay, session fixes
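As a rough illustration of the "object -> pg hash function; selectable" item in the list above, the sketch below maps an object name to a placement group by hashing the name and reducing it modulo the number of groups. This is an assumption-laden sketch: object_to_pg() is a hypothetical helper, zlib.crc32 is only a stand-in for whichever hash the OSD actually uses, and the plain modulo is a simplification.

```python
import zlib


def object_to_pg(name, pg_count, hash_fn=zlib.crc32):
    """Map an object name to a placement group index in [0, pg_count).

    hash_fn is a parameter so the hash is selectable; crc32 is a
    stand-in here, not the hash function Ceph uses.
    """
    if pg_count <= 0:
        raise ValueError("pg_count must be positive")
    return hash_fn(name.encode("utf-8")) % pg_count


# The same name always lands in the same placement group.
pg = object_to_pg("myobject", 8)
```

Passing a different callable as hash_fn changes the mapping without touching the callers, which is the sense in which the hash is "selectable" in this sketch.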

There is a known memory leak in the MDS in this release.  It should be fixed in the unstable git shortly.

Looking forward, the main items are:

  • stability
  • fixing a few pressing MDS performance issues
  • improving OSD interaction with btrfs (we may switch to using btrfs snapshots in place of the user transaction ioctls)
  • stability

posted by sage | No Comments | Tags: Releases

19 October 2009 - 15:12 v0.17 released

We’ve released v0.17.  This is mainly bug fixes and some monitor improvements.  Changes since v0.16 include:

  • kclient: fix >1 mds mdsmap decoding
  • kclient: fix mon subscription renewal
  • osdmap: fix encoding bug (and resulting kclient crash)
  • msgr: simplified policy, failure model, code
  • mon: less push, more pull
  • mon: clients maintain single monitor session, requests and replies are routed by the cluster
  • mon cluster expansion works (see Monitor cluster expansion)
  • osd: fix pgid parsing bug (broke restarts on clusters with 10 osds or more)

The other change with this release is that the kernel code is no longer bundled with the server code; it lives in a separate git tree.

posted by sage | No Comments | Tags: Releases

5 October 2009 - 15:12 v0.16 released

We’ve released v0.16.  The release primarily incorporates feedback on the Linux kernel client from LKML.  Changes since v0.15 include:

  • kclient: corrected inline abuse, use of __init, sockaddr_storage (IPv6 groundwork), and other feedback
  • kclient: xattr cleanups
  • kclient: fix invalidate lockup bug
  • kclient: fix msgr queue accounting lockup bug

Andrew Morton was nice enough to take some time to look at v0.15 and, “unless others emit convincing squeaks,” suggested I ask Stephen to include it in linux-next and send Linus a pull request for 2.6.33.  Yay!  With luck this will be the last version spammed to LKML in its entirety.

Meanwhile, Yehuda is continuing work on the security infrastructure to provide mutual trust between monitors, MDSs, OSDs, and clients, and Greg is working on some odds and ends (monitor cluster expansion, libceph/fuse/Hadoop client improvements).

P.S. I’d like to start building up-to-date RPMs as well.  If anyone wants to help get ceph.spec in sync with the Debian packages, that would be great.

posted by sage | No Comments | Tags: Releases

22 September 2009 - 9:49 v0.15 released

We’ve released v0.15.  This is mostly cleanups for the kernel client and some work on the monitor interface.  Changes since v0.14 include:

  • kclient: message api fixups (simpler, more robust)
  • kclient: more message pools (avoiding ENOMEM)
  • kclient: new ioctl to extract object name and location/address, given a file handle and offset
  • kclient: fix osd restart handling
  • msgr: internal interface improvements (session tracking)
  • monitor: interface/protocol cleanup, better session tracking
  • monclient: lots of fixes, improvements
  • debian: fixed permissions on headers in -dev packages; new radosgw package (S3 compatible REST interface to object store)

So nothing too groundbreaking feature-wise, mostly just bug fixes and internal code cleanups, plus the radosgw package, which lets you point existing applications that use the S3 storage service at a Ceph object store.

posted by sage | No Comments | Tags: Releases