<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/net/ceph/ceph_common.c, branch linus/master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/net/ceph/ceph_common.c?h=linus%2Fmaster</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/net/ceph/ceph_common.c?h=linus%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-02-02T17:50:36Z</updated>
<entry>
<title>libceph: optionally use bounce buffer on recv path in crc mode</title>
<updated>2022-02-02T17:50:36Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-12-30T14:13:32Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=038b8d1d1ab1cce11a158d30bf080ff41a2cfd15'/>
<id>urn:sha1:038b8d1d1ab1cce11a158d30bf080ff41a2cfd15</id>
<content type='text'>
Both msgr1 and msgr2 in crc mode are zero copy in the sense that
message data is read from the socket directly into the destination
buffer.  We assume that the destination buffer is stable (i.e. remains
unchanged while it is being read to) though.  Otherwise, CRC errors
ensue:

  libceph: read_partial_message 0000000048edf8ad data crc 1063286393 != exp. 228122706
  libceph: osd1 (1)192.168.122.1:6843 bad crc/signature

  libceph: bad data crc, calculated 57958023, expected 1805382778
  libceph: osd2 (2)192.168.122.1:6876 integrity error, bad crc

Introduce rxbounce option to enable use of a bounce buffer when
receiving message data.  In particular this is needed if a mapped
image is a Windows VM disk, passed to QEMU.  Windows has a system-wide
"dummy" page that may be mapped into the destination buffer (potentially
more than once into the same buffer) by the Windows Memory Manager in
an effort to generate a single large I/O [1][2].  QEMU makes a point of
preserving overlap relationships when cloning I/O vectors, so krbd gets
exposed to this behaviour.

[1] "What Is Really in That MDL?"
    https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn614012(v=vs.85)
[2] https://blogs.msmvps.com/kernelmustard/2005/05/04/dummy-pages/

URL: https://bugzilla.redhat.com/show_bug.cgi?id=1973317
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client</title>
<updated>2022-01-20T11:46:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-01-20T11:46:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=64f29d8856a9e0d1fcdc5344f76e70c364b941cb'/>
<id>urn:sha1:64f29d8856a9e0d1fcdc5344f76e70c364b941cb</id>
<content type='text'>
Pull ceph updates from Ilya Dryomov:
 "The highlight is the new mount "device" string syntax implemented by
  Venky Shankar. It solves some long-standing issues with using
  different auth entities and/or mounting different CephFS filesystems
  from the same cluster, remounting and also misleading /proc/mounts
  contents. The existing syntax of course remains to be maintained.

  On top of that, there is a couple of fixes for edge cases in quota and
  a new mount option for turning on unbuffered I/O mode globally instead
  of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"

* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
  ceph: move CEPH_SUPER_MAGIC definition to magic.h
  ceph: remove redundant Lsx caps check
  ceph: add new "nopagecache" option
  ceph: don't check for quotas on MDS stray dirs
  ceph: drop send metrics debug message
  rbd: make const pointer spaces a static const array
  ceph: Fix incorrect statfs report for small quota
  ceph: mount syntax module parameter
  doc: document new CephFS mount device syntax
  ceph: record updated mon_addr on remount
  ceph: new device mount syntax
  libceph: rename parse_fsid() to ceph_parse_fsid() and export
  libceph: generalize addr/ip parsing based on delimiter
</content>
</entry>
<entry>
<title>mm: allow !GFP_KERNEL allocations for kvmalloc</title>
<updated>2022-01-15T14:30:29Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2022-01-14T22:07:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a421ef303008b0ceee2cfc625c3246fa7654b0ca'/>
<id>urn:sha1:a421ef303008b0ceee2cfc625c3246fa7654b0ca</id>
<content type='text'>
Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented by
previous patches so we can allow the support for kvmalloc.  This will
allow some external users to simplify or completely remove their
helpers.

GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
explicitly documented so let's add a note about that.

ceph_kvmalloc is the first helper to be dropped and changed to kvmalloc.

Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>libceph: rename parse_fsid() to ceph_parse_fsid() and export</title>
<updated>2022-01-13T12:40:06Z</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2021-07-14T10:05:51Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=4153c7fc937a2afa077dbdb9fe3189b9981f423c'/>
<id>urn:sha1:4153c7fc937a2afa077dbdb9fe3189b9981f423c</id>
<content type='text'>
... as it is too generic. also, use __func__ when logging
rather than hardcoding the function name.

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: generalize addr/ip parsing based on delimiter</title>
<updated>2022-01-13T12:40:05Z</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2021-07-14T10:05:50Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=2d7c86a8f9cdce1408c4f3c69d94d007eff2f179'/>
<id>urn:sha1:2d7c86a8f9cdce1408c4f3c69d94d007eff2f179</id>
<content type='text'>
... and remove hardcoded function name in ceph_parse_ips().

[ idryomov: delim parameter, drop CEPH_ADDR_PARSE_DEFAULT_DELIM ]

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: remove osdtimeout option entirely</title>
<updated>2021-02-16T11:09:52Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-01-22T15:50:42Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=d7ef2e59e3b908285fbbb815c4547bdba4299890'/>
<id>urn:sha1:d7ef2e59e3b908285fbbb815c4547bdba4299890</id>
<content type='text'>
Commit 83aff95eb9d6 ("libceph: remove 'osdtimeout' option") deprecated
osdtimeout over 8 years ago, but it is still recognized.  Let's remove
it entirely.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>libceph: deprecate [no]cephx_require_signatures options</title>
<updated>2021-02-16T11:09:52Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-01-22T14:41:14Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=afd56e78dd179d5638333bb407d9f7da2863381a'/>
<id>urn:sha1:afd56e78dd179d5638333bb407d9f7da2863381a</id>
<content type='text'>
These options were introduced in 3.19 with support for message signing
and are rather useless, as explained in commit a51983e4dd2d ("libceph:
add nocephx_sign_messages option").  Deprecate them.

In case there is someone out there with a cluster that lacks support
for MSG_AUTH feature (very unlikely but has to be considered since we
haven't formally raised the bar from argonaut to bobtail yet), make
nocephx_sign_messages also waive MSG_AUTH requirement.  This is probably
how it should have been done in the first place -- if we aren't going
to sign, requiring the signing feature makes no sense.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>libceph: introduce connection modes and ms_mode option</title>
<updated>2020-12-14T22:21:50Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2020-11-19T15:04:58Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=00498b994113a871a556f7ff24a4cf8a00611700'/>
<id>urn:sha1:00498b994113a871a556f7ff24a4cf8a00611700</id>
<content type='text'>
msgr2 supports two connection modes: crc (plain) and secure (on-wire
encryption).  Connection mode is picked by server based on input from
client.

Introduce ms_mode option:

  ms_mode=legacy        - msgr1 (default)
  ms_mode=crc           - crc mode, if denied fail
  ms_mode=secure        - secure mode, if denied fail
  ms_mode=prefer-crc    - crc mode, if denied agree to secure mode
  ms_mode=prefer-secure - secure mode, if denied agree to crc mode

ms_mode affects all connections, we don't separate connections to mons
like it's done in userspace with ms_client_mode vs ms_mon_client_mode.

For now the default is legacy, to be flipped to prefer-crc after some
time.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: move away from global osd_req_flags</title>
<updated>2020-06-16T14:01:53Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2020-06-04T09:12:34Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=22d2cfdffa5bff3566e16cb7320e13ceb814674b'/>
<id>urn:sha1:22d2cfdffa5bff3566e16cb7320e13ceb814674b</id>
<content type='text'>
osd_req_flags is overly general and doesn't suit its only user
(read_from_replica option) well:

- applying osd_req_flags in account_request() affects all OSD
  requests, including linger (i.e. watch and notify).  However,
  linger requests should always go to the primary even though
  some of them are reads (e.g. notify has side effects but it
  is a read because it doesn't result in mutation on the OSDs).

- calls to class methods that are reads are allowed to go to
  the replica, but most such calls issued for "rbd map" and/or
  exclusive lock transitions are requested to be resent to the
  primary via EAGAIN, doubling the latency.

Get rid of global osd_req_flags and set read_from_replica flag
only on specific OSD requests instead.

Fixes: 8ad44d5e0d1e ("libceph: read_from_replica option")
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client</title>
<updated>2020-06-08T19:49:18Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-08T19:49:18Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=95288a9b3beee8dd69d73b7691e36f2f231b7903'/>
<id>urn:sha1:95288a9b3beee8dd69d73b7691e36f2f231b7903</id>
<content type='text'>
Pull ceph updates from Ilya Dryomov:
 "The highlights are:

   - OSD/MDS latency and caps cache metrics infrastructure for the
     filesytem (Xiubo Li). Currently available through debugfs and will
     be periodically sent to the MDS in the future.

   - support for replica reads (balanced and localized reads) for rbd
     and the filesystem (myself). The default remains to always read
     from primary, users can opt-in with the new crush_location and
     read_from_replica options. Note that reading from replica is safe
     for general use only since Octopus.

   - support for RADOS allocation hint flags (myself). Currently used by
     rbd to propagate the compressible/incompressible hint given with
     the new compression_hint map option and ready for passing on more
     advanced hints, e.g. based on fadvise() from the filesystem.

   - support for efficient cross-quota-realm renames (Luis Henriques)

   - assorted cap handling improvements and cleanups, particularly
     untangling some of the locking (Jeff Layton)"

* tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client: (29 commits)
  rbd: compression_hint option
  libceph: support for alloc hint flags
  libceph: read_from_replica option
  libceph: support for balanced and localized reads
  libceph: crush_location infrastructure
  libceph: decode CRUSH device/bucket types and names
  libceph: add non-asserting rbtree insertion helper
  ceph: skip checking caps when session reconnecting and releasing reqs
  ceph: make sure mdsc-&gt;mutex is nested in s-&gt;s_mutex to fix dead lock
  ceph: don't return -ESTALE if there's still an open file
  libceph, rbd: replace zero-length array with flexible-array
  ceph: allow rename operation under different quota realms
  ceph: normalize 'delta' parameter usage in check_quota_exceeded
  ceph: ceph_kick_flushing_caps needs the s_mutex
  ceph: request expedited service on session's last cap flush
  ceph: convert mdsc-&gt;cap_dirty to a per-session list
  ceph: reset i_requested_max_size if file write is not wanted
  ceph: throw a warning if we destroy session with mutex still locked
  ceph: fix potential race in ceph_check_caps
  ceph: document what protects i_dirty_item and i_flushing_item
  ...
</content>
</entry>
</feed>
