<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-dev/drivers/pci/controller/pci-hyperv.c, branch linus/master</title>
<subtitle>Linux kernel development work - see feature branches</subtitle>
<id>https://git.zx2c4.com/linux-dev/atom/drivers/pci/controller/pci-hyperv.c?h=linus%2Fmaster</id>
<link rel='self' href='https://git.zx2c4.com/linux-dev/atom/drivers/pci/controller/pci-hyperv.c?h=linus%2Fmaster'/>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/'/>
<updated>2022-05-13T16:57:32Z</updated>
<entry>
<title>PCI: hv: Fix synchronization between channel callback and hv_pci_bus_exit()</title>
<updated>2022-05-13T16:57:32Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2022-05-11T22:32:07Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=b4927bd272623694314f37823302f9d67aa5964c'/>
<id>urn:sha1:b4927bd272623694314f37823302f9d67aa5964c</id>
<content type='text'>
[ Similarly to commit a765ed47e4516 ("PCI: hv: Fix synchronization
  between channel callback and hv_compose_msi_msg()"): ]

The (on-stack) teardown packet becomes invalid once the completion
timeout in hv_pci_bus_exit() has expired and hv_pci_bus_exit() has
returned.  Prevent the channel callback from accessing the invalid
packet by removing the ID associated to such packet from the VMbus
requestor in hv_pci_bus_exit().

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Acked-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Link: https://lore.kernel.org/r/20220511223207.3386-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Add validation for untrusted Hyper-V values</title>
<updated>2022-05-13T16:57:32Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2022-05-11T22:32:06Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=9937fa6d1eb6fac95586970e17617a718919c858'/>
<id>urn:sha1:9937fa6d1eb6fac95586970e17617a718919c858</id>
<content type='text'>
For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer.  Ensure that
invalid values cannot cause data being copied out of the bounds of the
source buffer in hv_pci_onchannelcallback().

While at it, remove a redundant validation in hv_pci_generic_compl():
hv_pci_onchannelcallback() already ensures that all processed incoming
packets are "at least as large as [in fact larger than] a response".

Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Acked-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Link: https://lore.kernel.org/r/20220511223207.3386-2-parri.andrea@gmail.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Fix interrupt mapping for multi-MSI</title>
<updated>2022-05-11T17:51:02Z</updated>
<author>
<name>Jeffrey Hugo</name>
<email>quic_jhugo@quicinc.com</email>
</author>
<published>2022-05-11T15:23:19Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a2bad844a67b1c7740bda63e87453baf63c3a7f7'/>
<id>urn:sha1:a2bad844a67b1c7740bda63e87453baf63c3a7f7</id>
<content type='text'>
According to Dexuan, the hypervisor folks beleive that multi-msi
allocations are not correct.  compose_msi_msg() will allocate multi-msi
one by one.  However, multi-msi is a block of related MSIs, with alignment
requirements.  In order for the hypervisor to allocate properly aligned
and consecutive entries in the IOMMU Interrupt Remapping Table, there
should be a single mapping request that requests all of the multi-msi
vectors in one shot.

Dexuan suggests detecting the multi-msi case and composing a single
request related to the first MSI.  Then for the other MSIs in the same
block, use the cached information.  This appears to be viable, so do it.

Suggested-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Signed-off-by: Jeffrey Hugo &lt;quic_jhugo@quicinc.com&gt;
Reviewed-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Tested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/1652282599-21643-1-git-send-email-quic_jhugo@quicinc.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()</title>
<updated>2022-05-11T17:50:20Z</updated>
<author>
<name>Jeffrey Hugo</name>
<email>quic_jhugo@quicinc.com</email>
</author>
<published>2022-05-11T15:23:02Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=b4b77778ecc5bfbd4e77de1b2fd5c1dd3c655f1f'/>
<id>urn:sha1:b4b77778ecc5bfbd4e77de1b2fd5c1dd3c655f1f</id>
<content type='text'>
Currently if compose_msi_msg() is called multiple times, it will free any
previous IRTE allocation, and generate a new allocation.  While nothing
prevents this from occurring, it is extraneous when Linux could just reuse
the existing allocation and avoid a bunch of overhead.

However, when future IRTE allocations operate on blocks of MSIs instead of
a single line, freeing the allocation will impact all of the lines.  This
could cause an issue where an allocation of N MSIs occurs, then some of
the lines are retargeted, and finally the allocation is freed/reallocated.
The freeing of the allocation removes all of the configuration for the
entire block, which requires all the lines to be retargeted, which might
not happen since some lines might already be unmasked/active.

Signed-off-by: Jeffrey Hugo &lt;quic_jhugo@quicinc.com&gt;
Reviewed-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Tested-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Tested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/1652282582-21595-1-git-send-email-quic_jhugo@quicinc.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time</title>
<updated>2022-05-03T10:59:10Z</updated>
<author>
<name>Dexuan Cui</name>
<email>decui@microsoft.com</email>
</author>
<published>2022-05-02T07:42:55Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=23e118a48acf7be223e57d98e98da8ac5a4071ac'/>
<id>urn:sha1:23e118a48acf7be223e57d98e98da8ac5a4071ac</id>
<content type='text'>
Currently when the pci-hyperv driver finishes probing and initializing the
PCI device, it sets the PCI_COMMAND_MEMORY bit; later when the PCI device
is registered to the core PCI subsystem, the core PCI driver's BAR detection
and initialization code toggles the bit multiple times, and each toggling of
the bit causes the hypervisor to unmap/map the virtual BARs from/to the
physical BARs, which can be slow if the BAR sizes are huge, e.g., a Linux VM
with 14 GPU devices has to spend more than 3 minutes on BAR detection and
initialization, causing a long boot time.

Reduce the boot time by not setting the PCI_COMMAND_MEMORY bit when we
register the PCI device (there is no need to have it set in the first place).
The bit stays off till the PCI device driver calls pci_enable_device().
With this change, the boot time of such a 14-GPU VM is reduced by almost
3 minutes.

Link: https://lore.kernel.org/lkml/20220419220007.26550-1-decui@microsoft.com/
Tested-by: Boqun Feng (Microsoft) &lt;boqun.feng@gmail.com&gt;
Signed-off-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Acked-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Cc: Jake Oshins &lt;jakeo@microsoft.com&gt;
Link: https://lore.kernel.org/r/20220502074255.16901-1-decui@microsoft.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI</title>
<updated>2022-04-28T15:09:02Z</updated>
<author>
<name>Jeffrey Hugo</name>
<email>quic_jhugo@quicinc.com</email>
</author>
<published>2022-04-27T14:07:33Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=455880dfe292a2bdd3b4ad6a107299fce610e64b'/>
<id>urn:sha1:455880dfe292a2bdd3b4ad6a107299fce610e64b</id>
<content type='text'>
In the multi-MSI case, hv_arch_irq_unmask() will only operate on the first
MSI of the N allocated.  This is because only the first msi_desc is cached
and it is shared by all the MSIs of the multi-MSI block.  This means that
hv_arch_irq_unmask() gets the correct address, but the wrong data (always
0).

This can break MSIs.

Lets assume MSI0 is vector 34 on CPU0, and MSI1 is vector 33 on CPU0.

hv_arch_irq_unmask() is called on MSI0.  It uses a hypercall to configure
the MSI address and data (0) to vector 34 of CPU0.  This is correct.  Then
hv_arch_irq_unmask is called on MSI1.  It uses another hypercall to
configure the MSI address and data (0) to vector 33 of CPU0.  This is
wrong, and results in both MSI0 and MSI1 being routed to vector 33.  Linux
will observe extra instances of MSI1 and no instances of MSI0 despite the
endpoint device behaving correctly.

For the multi-MSI case, we need unique address and data info for each MSI,
but the cached msi_desc does not provide that.  However, that information
can be gotten from the int_desc cached in the chip_data by
compose_msi_msg().  Fix the multi-MSI case to use that cached information
instead.  Since hv_set_msi_entry_from_desc() is no longer applicable,
remove it.

Signed-off-by: Jeffrey Hugo &lt;quic_jhugo@quicinc.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/1651068453-29588-1-git-send-email-quic_jhugo@quicinc.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg()</title>
<updated>2022-04-25T15:51:13Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2022-04-19T12:23:25Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=a765ed47e45166451680ee9af2b9e435c82ec3ba'/>
<id>urn:sha1:a765ed47e45166451680ee9af2b9e435c82ec3ba</id>
<content type='text'>
Dexuan wrote:

  "[...]  when we disable AccelNet, the host PCI VSP driver sends a
   PCI_EJECT message first, and the channel callback may set
   hpdev-&gt;state to hv_pcichild_ejecting on a different CPU.  This can
   cause hv_compose_msi_msg() to exit from the loop and 'return', and
   the on-stack variable 'ctxt' is invalid.  Now, if the response
   message from the host arrives, the channel callback will try to
   access the invalid 'ctxt' variable, and this may cause a crash."

Schematically:

  Hyper-V sends PCI_EJECT msg
    hv_pci_onchannelcallback()
      state = hv_pcichild_ejecting
                                       hv_compose_msi_msg()
                                         alloc and init comp_pkt
                                         state == hv_pcichild_ejecting
  Hyper-V sends VM_PKT_COMP msg
    hv_pci_onchannelcallback()
      retrieve address of comp_pkt
                                         'free' comp_pkt and return
      comp_pkt-&gt;completion_func()

Dexuan also showed how the crash can be triggered after introducing
suitable delays in the driver code, thus validating the 'assumption'
that the host can still normally respond to the guest's compose_msi
request after the host has started to eject the PCI device.

Fix the synchronization by leveraging the requestor lock as follows:

  - Before 'return'-ing in hv_compose_msi_msg(), remove the ID (while
    holding the requestor lock) associated to the completion packet.

  - Retrieve the address *and call -&gt;completion_func() within a same
    (requestor) critical section in hv_pci_onchannelcallback().

Reported-by: Wei Hu &lt;weh@microsoft.com&gt;
Reported-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/20220419122325.10078-7-parri.andrea@gmail.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening</title>
<updated>2022-04-25T15:51:12Z</updated>
<author>
<name>Andrea Parri (Microsoft)</name>
<email>parri.andrea@gmail.com</email>
</author>
<published>2022-04-19T12:23:21Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=de5ddb7d44347ad8b00533c1850a4e2e636a1ce9'/>
<id>urn:sha1:de5ddb7d44347ad8b00533c1850a4e2e636a1ce9</id>
<content type='text'>
Currently, pointers to guest memory are passed to Hyper-V as transaction
IDs in hv_pci.  In the face of errors or malicious behavior in Hyper-V,
hv_pci should not expose or trust the transaction IDs returned by
Hyper-V to be valid guest memory addresses.  Instead, use small integers
generated by vmbus_requestor as request (transaction) IDs.

Suggested-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Andrea Parri (Microsoft) &lt;parri.andrea@gmail.com&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/20220419122325.10078-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hv: Fix multi-MSI to allow more than one MSI vector</title>
<updated>2022-04-25T15:50:17Z</updated>
<author>
<name>Jeffrey Hugo</name>
<email>quic_jhugo@quicinc.com</email>
</author>
<published>2022-04-13T13:36:21Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=08e61e861a0e47e5e1a3fb78406afd6b0cea6b6d'/>
<id>urn:sha1:08e61e861a0e47e5e1a3fb78406afd6b0cea6b6d</id>
<content type='text'>
If the allocation of multiple MSI vectors for multi-MSI fails in the core
PCI framework, the framework will retry the allocation as a single MSI
vector, assuming that meets the min_vecs specified by the requesting
driver.

Hyper-V advertises that multi-MSI is supported, but reuses the VECTOR
domain to implement that for x86.  The VECTOR domain does not support
multi-MSI, so the alloc will always fail and fallback to a single MSI
allocation.

In short, Hyper-V advertises a capability it does not implement.

Hyper-V can support multi-MSI because it coordinates with the hypervisor
to map the MSIs in the IOMMU's interrupt remapper, which is something the
VECTOR domain does not have.  Therefore the fix is simple - copy what the
x86 IOMMU drivers (AMD/Intel-IR) do by removing
X86_IRQ_ALLOC_CONTIGUOUS_VECTORS after calling the VECTOR domain's
pci_msi_prepare().

Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Jeffrey Hugo &lt;quic_jhugo@quicinc.com&gt;
Reviewed-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Link: https://lore.kernel.org/r/1649856981-14649-1-git-send-email-quic_jhugo@quicinc.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux</title>
<updated>2022-04-07T16:35:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-04-07T16:35:34Z</published>
<link rel='alternate' type='text/html' href='https://git.zx2c4.com/linux-dev/commit/?id=42e7a03d3badebd4e70aea5362d6914dfc7c220b'/>
<id>urn:sha1:42e7a03d3badebd4e70aea5362d6914dfc7c220b</id>
<content type='text'>
Pull hyperv fixes from Wei Liu:

 - Correctly propagate coherence information for VMbus devices (Michael
   Kelley)

 - Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng)

 - Use barrier to prevent reording when reading ring buffer (Michael
   Kelley)

 - Use virt_store_mb in favour of smp_store_mb (Andrea Parri)

 - Fix VMbus device object initialization (Andrea Parri)

 - Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri)

 - Fix a crash when unloading VMbus module (Guilherme G. Piccoli)

* tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
  Drivers: hv: balloon: Disable balloon and hot-add accordingly
  Drivers: hv: balloon: Support status report for larger page sizes
  Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
  PCI: hv: Propagate coherence from VMbus device to PCI device
  Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
  Drivers: hv: vmbus: Fix potential crash on module unload
  Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
  Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
</content>
</entry>
</feed>
