<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/net/ceph, branch linus/master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/net/ceph?h=linus%2Fmaster</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/net/ceph?h=linus%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-05-25T18:45:13Z</updated>
<entry>
<title>libceph: use swap() macro instead of taking tmp variable</title>
<updated>2022-05-25T18:45:13Z</updated>
<author>
<name>Guo Zhengkui</name>
<email>guozhengkui@vivo.com</email>
</author>
<published>2022-04-12T06:46:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=d9d58f0402a8bd14b980e6f41a5aa28aa0ca0e04'/>
<id>urn:sha1:d9d58f0402a8bd14b980e6f41a5aa28aa0ca0e04</id>
<content type='text'>
Fix the following coccicheck warning:
net/ceph/crush/mapper.c:1077:8-9: WARNING opportunity for swap()

by using swap() for the swapping of variable values and drop
the tmp variable that is not needed any more.

Signed-off-by: Guo Zhengkui &lt;guozhengkui@vivo.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: fix misleading ceph_osdc_cancel_request() comment</title>
<updated>2022-05-18T19:21:29Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2022-05-16T15:17:54Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=d0bb883c6355bcb2cc149fb4d5c3b28ccd327a5e'/>
<id>urn:sha1:d0bb883c6355bcb2cc149fb4d5c3b28ccd327a5e</id>
<content type='text'>
cancel_request() never guaranteed that after its return the OSD
client would be completely done with the OSD request.  The callback
(if specified) can still be invoked and a ref can still be held.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
</content>
</entry>
<entry>
<title>libceph: fix potential use-after-free on linger ping and resends</title>
<updated>2022-05-18T19:21:05Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2022-05-14T10:16:47Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=75dbb685f4e8786c33ddef8279bab0eadfb0731f'/>
<id>urn:sha1:75dbb685f4e8786c33ddef8279bab0eadfb0731f</id>
<content type='text'>
request_reinit() is not only ugly as the comment rightfully suggests,
but also unsafe.  Even though it is called with osdc-&gt;lock held for
write in all cases, resetting the OSD request refcount can still race
with handle_reply() and result in use-after-free.  Taking linger ping
as an example:

    handle_timeout thread                     handle_reply thread

                                              down_read(&amp;osdc-&gt;lock)
                                              req = lookup_request(...)
                                              ...
                                              finish_request(req)  # unregisters
                                              up_read(&amp;osdc-&gt;lock)
                                              __complete_request(req)
                                                linger_ping_cb(req)

      # req-&gt;r_kref == 2 because handle_reply still holds its ref

    down_write(&amp;osdc-&gt;lock)
    send_linger_ping(lreq)
      req = lreq-&gt;ping_req  # same req
      # cancel_linger_request is NOT
      # called - handle_reply already
      # unregistered
      request_reinit(req)
        WARN_ON(req-&gt;r_kref != 1)  # fires
        request_init(req)
          kref_init(req-&gt;r_kref)

                   # req-&gt;r_kref == 1 after kref_init

                                              ceph_osdc_put_request(req)
                                                kref_put(req-&gt;r_kref)

            # req-&gt;r_kref == 0 after kref_put, req is freed

        &lt;further req initialization/use&gt; !!!

This happens because send_linger_ping() always (re)uses the same OSD
request for watch ping requests, relying on cancel_linger_request() to
unregister it from the OSD client and rip its messages out from the
messenger.  send_linger() does the same for watch/notify registration
and watch reconnect requests.  Unfortunately cancel_request() doesn't
guarantee that after it returns the OSD client would be completely done
with the OSD request -- a ref could still be held and the callback (if
specified) could still be invoked too.

The original motivation for request_reinit() was inability to deal with
allocation failures in send_linger() and send_linger_ping().  Switching
to using osdc-&gt;req_mempool (currently only used by CephFS) respects that
and allows us to get rid of request_reinit().

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Acked-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>libceph: disambiguate cluster/pool full log message</title>
<updated>2022-04-25T08:45:15Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2022-03-12T10:09:34Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=dc9b0dc4561dedd44b2bf4b8e5ef1a8a040b2424'/>
<id>urn:sha1:dc9b0dc4561dedd44b2bf4b8e5ef1a8a040b2424</id>
<content type='text'>
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: drop else branches in prepare_read_data{,_cont}</title>
<updated>2022-03-01T17:26:36Z</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-02-08T13:19:45Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=27884f4bce63ebb8821992787d3e687cc24d95a3'/>
<id>urn:sha1:27884f4bce63ebb8821992787d3e687cc24d95a3</id>
<content type='text'>
Just call set_in_bvec in the non-conditional part.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: optionally use bounce buffer on recv path in crc mode</title>
<updated>2022-02-02T17:50:36Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-12-30T14:13:32Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=038b8d1d1ab1cce11a158d30bf080ff41a2cfd15'/>
<id>urn:sha1:038b8d1d1ab1cce11a158d30bf080ff41a2cfd15</id>
<content type='text'>
Both msgr1 and msgr2 in crc mode are zero copy in the sense that
message data is read from the socket directly into the destination
buffer.  We assume that the destination buffer is stable (i.e. remains
unchanged while it is being read to) though.  Otherwise, CRC errors
ensue:

  libceph: read_partial_message 0000000048edf8ad data crc 1063286393 != exp. 228122706
  libceph: osd1 (1)192.168.122.1:6843 bad crc/signature

  libceph: bad data crc, calculated 57958023, expected 1805382778
  libceph: osd2 (2)192.168.122.1:6876 integrity error, bad crc

Introduce rxbounce option to enable use of a bounce buffer when
receiving message data.  In particular this is needed if a mapped
image is a Windows VM disk, passed to QEMU.  Windows has a system-wide
"dummy" page that may be mapped into the destination buffer (potentially
more than once into the same buffer) by the Windows Memory Manager in
an effort to generate a single large I/O [1][2].  QEMU makes a point of
preserving overlap relationships when cloning I/O vectors, so krbd gets
exposed to this behaviour.

[1] "What Is Really in That MDL?"
    https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn614012(v=vs.85)
[2] https://blogs.msmvps.com/kernelmustard/2005/05/04/dummy-pages/

URL: https://bugzilla.redhat.com/show_bug.cgi?id=1973317
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>libceph: make recv path in secure mode work the same as send path</title>
<updated>2022-02-02T17:50:36Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2022-01-23T16:27:47Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=2ea88716369ac9a7486a8cb309d6bf1239ea156c'/>
<id>urn:sha1:2ea88716369ac9a7486a8cb309d6bf1239ea156c</id>
<content type='text'>
The recv path of secure mode is intertwined with that of crc mode.
While it's slightly more efficient that way (the ciphertext is read
into the destination buffer and decrypted in place, thus avoiding
two potentially heavy memory allocations for the bounce buffer and
the corresponding sg array), it isn't really amenable to changes.
Sacrifice that edge and align with the send path which always uses
a full-sized bounce buffer (currently there is no other way -- if
the kernel crypto API ever grows support for streaming (piecewise)
en/decryption for GCM [1], we would be able to easily take advantage
of that on both sides).

[1] https://lore.kernel.org/all/20141225202830.GA18794@gondor.apana.org.au/

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client</title>
<updated>2022-01-20T11:46:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-01-20T11:46:20Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=64f29d8856a9e0d1fcdc5344f76e70c364b941cb'/>
<id>urn:sha1:64f29d8856a9e0d1fcdc5344f76e70c364b941cb</id>
<content type='text'>
Pull ceph updates from Ilya Dryomov:
 "The highlight is the new mount "device" string syntax implemented by
  Venky Shankar. It solves some long-standing issues with using
  different auth entities and/or mounting different CephFS filesystems
  from the same cluster, remounting and also misleading /proc/mounts
  contents. The existing syntax of course remains to be maintained.

  On top of that, there is a couple of fixes for edge cases in quota and
  a new mount option for turning on unbuffered I/O mode globally instead
  of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"

* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
  ceph: move CEPH_SUPER_MAGIC definition to magic.h
  ceph: remove redundant Lsx caps check
  ceph: add new "nopagecache" option
  ceph: don't check for quotas on MDS stray dirs
  ceph: drop send metrics debug message
  rbd: make const pointer spaces a static const array
  ceph: Fix incorrect statfs report for small quota
  ceph: mount syntax module parameter
  doc: document new CephFS mount device syntax
  ceph: record updated mon_addr on remount
  ceph: new device mount syntax
  libceph: rename parse_fsid() to ceph_parse_fsid() and export
  libceph: generalize addr/ip parsing based on delimiter
</content>
</entry>
<entry>
<title>mm: allow !GFP_KERNEL allocations for kvmalloc</title>
<updated>2022-01-15T14:30:29Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2022-01-14T22:07:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a421ef303008b0ceee2cfc625c3246fa7654b0ca'/>
<id>urn:sha1:a421ef303008b0ceee2cfc625c3246fa7654b0ca</id>
<content type='text'>
Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented by
previous patches so we can allow the support for kvmalloc.  This will
allow some external users to simplify or completely remove their
helpers.

GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
explicitly documented so let's add a note about that.

ceph_kvmalloc is the first helper to be dropped and changed to kvmalloc.

Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>libceph: rename parse_fsid() to ceph_parse_fsid() and export</title>
<updated>2022-01-13T12:40:06Z</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2021-07-14T10:05:51Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=4153c7fc937a2afa077dbdb9fe3189b9981f423c'/>
<id>urn:sha1:4153c7fc937a2afa077dbdb9fe3189b9981f423c</id>
<content type='text'>
... as it is too generic. also, use __func__ when logging
rather than hardcoding the function name.

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
</feed>
