aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2014-04-24igb: Cleanups to fix for trailing statementCarolyn Wyborny1-2/+2
This patch fixes WARNING:TRAILING_STATEMENT from checkpatch file check. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24igb: Cleanups to fix pointer location errorCarolyn Wyborny1-1/+1
This patch fixes ERROR:POINTER_LOCATION from checkpatch file check. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-24Altera TSE: Remove unnecessary cast of void pointersTobias Klauser2-30/+22
No need to cast void pointers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24vxlan: add x-netns supportNicolas Dichtel3-18/+50
This patch allows to switch the netns when packet is encapsulated or decapsulated. The vxlan socket is openned into the i/o netns, ie into the netns where encapsulated packets are received. The socket lookup is done into this netns to find the corresponding vxlan tunnel. After decapsulation, the packet is injecting into the corresponding interface which may stand to another netns. When one of the two netns is removed, the tunnel is destroyed. Configuration example: ip netns add netns1 ip netns exec netns1 ip link set lo up ip link add vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 ip link set vxlan10 netns netns1 ip netns exec netns1 ip addr add 192.168.0.249/24 broadcast 192.168.0.255 dev vxlan10 ip netns exec netns1 ip link set vxlan10 up Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24bcm63xx_enet: Use ARRAY_SIZE instead of open coding itTobias Klauser1-2/+1
Use the ARRAY_SIZE macro to determine the number of entries in bcm_enet_gstrings_stats. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller29-158/+157
Conflicts: drivers/net/ethernet/intel/igb/e1000_mac.c net/core/filter.c Both conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24smc91x: improve definition of debug macrosZi Shen Lim1-11/+12
Redefine some macros that were conditioned upon SMC_DEBUG level. By allowing compiler to verify parameters used by these macros unconditionally, we can flag compilation failures. Compiler will still optimize out the unused code path depending on SMC_DEBUG, so this is a net gain. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24bonding: Add tlb_dynamic_lb parameter for tlb modeMahesh Bandewar7-12/+111
The aggresive load balancing causes packet re-ordering as active flows are moved from a slave to another within the group. Sometime this aggresive lb is not necessary if the preference is for less re-ordering. This parameter if used with value "0" disables this dynamic flow shuffling minimizing packet re-ordering. Of course the side effect is that it has to live with the static load balancing that the hashing distribution provides. This impact is less severe if the correct xmit-hashing-policy is used for the tlb setup. The default value of the parameter is set to "1" mimicing the earlier behavior. Ran the netperf test with 200 stream for 1 min between two hosts with 4x1G trunk (xmit-lb mode with xmit-policy L3+4) before and after these changes. Following was the command used for those 200 instances - netperf -t TCP_RR -l 60 -s 5 -H <host> -- -r81920,81920 Transactions per second: Before change: 1,367.11 After change: 1,470.65 Change-Id: Ie3f75c77282cf602e83a6e833c6eb164e72a0990 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24bonding: Added bond_tlb_xmit() for tlb mode.Mahesh Bandewar5-4/+33
Re-organized the xmit function for the lb mode separating tlb xmit from the alb mode. This will enable use of the hashing policies like 802.3ad mode. Also extended use of xmit-hash-policy to tlb mode. Now the tlb-mode defaults to BOND_XMIT_POLICY_LAYER2 if the xmit policy module parameter is not set (just like 802.3ad, or Xor mode). Change-Id: I140257403d272df75f477b380207338d0f04963e Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24bonding: Reorg bond_alb_xmit codeMahesh Bandewar1-35/+44
Separating the actual xmit part from the function in a separate function that can be used in the tlb_xmit in the next patch. Also there is no reason do_tx_balance to be an int so changing it to bool type. Change-Id: I9c48ff30487810f68587e621a191db616f49bd3b Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24bonding: Changed hashing function to just provide hashMahesh Bandewar3-8/+6
Modified the hash function to return just hash separating from the modulo operation that can be performed by the caller. This is to make way for the tlb mode to use the same hashing policies that are used in the 802.3ad and Xor mode. Change-Id: I276609e87e0ca213c4d1b17b79c5e0b0f3d0dd6f Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-246lowpan: nuke net_ieee802154_lowpan() accessor when 6lowpan is disabledLuis R. Rodriguez1-7/+0
Johannes noted this is not needed, all of the fragment accessors don't need CONFIG_NET_NS. This goes test compiled with CONFIG_BT_6LOWPAN=y and a disabled CONFIG_NET_NS. CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: linux-zigbee-devel@lists.sourceforge.net Cc: David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-nextDavid S. Miller32-292/+478
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe, ixgbevf, e1000e, igb and i40e. Jacob converts the ixgbe low_water into an array which allows the algorithm to output different values for different TCs and we can distinguish between them. Removes vlan_filter_disable() and vlan_filter_enable() in ixgbe so that we can do the work directly in set_rx_mode(). Changes the setting of multicast filters only when the interface is not in promiscuous mode for multicast packets in ixgbe. Improves MAC filter handling by adding mac_table API based on work done for igb, which includes functions to add/delete MAC filters. Mark changes register reads in ixgbe to an out-of-line function since register reads are slow. Emil provides a ixgbevf patch to update the driver description since it supports more than just 82599 parts now. David provides several cleanup patches for e1000e which resolve some checkpatch issues as well as changing occurrences of returning 0 or 1 in bool functions to returning true false or true. Carolyn provides several cleanup patches for igb which fix checkpatch warnings. Mitch provides a fix for i40evf where the driver would correctly allow the virtual function link state to be controlled by 'ip set link', but would not report it correctly back. This is fixed by filling out the appropriate field in the VF info struct. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net: filter: initialize A and X registersAlexei Starovoitov1-7/+9
exisiting BPF verifier allows uninitialized access to registers, 'ret A' is considered to be a valid filter. So initialize A and X to zero to prevent leaking kernel memory In the future BPF verifier will be rejecting such filters Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net/phy: Remove return value for void functionShruti Kanetkar3-6/+3
This was caught when using a spatch (aka. coccinelle) script written by Joe Perches. Cc: Joe Perches <joe@perches.com> Signed-off-by: Shruti Kanetkar <Shruti@Freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23Merge branch 'via-rhine'David S. Miller6-179/+303
Alexey Charkov says: ==================== net: via-rhine: add support for on-chip Rhine controllers This series introduces platform bus (OpenFirmware) binding for via-rhine, as used in various ARM-based Systems-on-Chip by VIA/WonderMedia. This has been tested in OF configuration by myself on a WM8950-based VIA APC Rock development board and on a WM8850-based netbook, and in PCI configuration by Roger. Please note that the initial version of these patches was signed off by Roger, but some time has passed since then, so I'm not including his sign-off until explicit notice. Changes since v1: - Fixed indentation of function arguments - Switched to 'dev_is_pci' instead of string comparison on bus name - Dropped 'rhine,revision' DT attribute, put the revision into OF match table instead - Included actual device tree nodes where applicable ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net: via-rhine: add OF bus bindingAlexey Charkov6-115/+229
This should make the driver usable with VIA/WonderMedia ARM-based Systems-on-Chip integrated Rhine III adapters. Note that these are always in MMIO mode, and don't have any known EEPROM. Signed-off-by: Alexey Charkov <alchark@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net: via-rhine: reduce usage of the PCI-specific structAlexey Charkov1-54/+62
Use more generic data structures instead of struct pci_dev wherever possible in preparation for OF bus binding Signed-off-by: Alexey Charkov <alchark@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net: via-rhine: switch to generic DMA functionsAlexey Charkov1-41/+43
Remove legacy PCI DMA wrappers and instead use generic DMA functions directly in preparation for OF bus binding Signed-off-by: Alexey Charkov <alchark@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net: Update my email addressBen Hutchings1-1/+1
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23vxlan: ensure to advertise the right fdb remoteNicolas Dichtel1-17/+21
The goal of this patch is to fix rtnelink notification. The main problem was about notification for fdb entry with more than one remote. Before the patch, when a remote was added to an existing fdb entry, the kernel advertised the first remote instead of the added one. Also when a remote was removed from a fdb entry with several remotes, the deleted remote was not advertised. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23net/phy: micrel: fix bugged test on device tree loading for ksz9021Hubert Chaumette1-3/+3
In ksz9021_load_values_from_of() val2 to val4 aren't tested against their initialization value. This causes the test to always succeed, and this value to be used as if it was loaded from the devicetree instead of being ignored, in case of a missing/invalid property in the ethernet OF device node. As a result, the value "0" is written to the relevant registers. Change the conditions to test against the right initialization value. Signed-off-by: Hubert Chaumette <hchaumette@adeneo-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23macvlan: Fix leak and NULL dereference on error pathHerbert Xu1-4/+7
The recent patch that moved broadcasts to process context added a couple of bugs on the error path where we may dereference NULL or leak an skb. This patch fixes them. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23Merge branch 'gre_netns'David S. Miller2-23/+35
Nicolas Dichtel says: ==================== gre: allow to switch netns during encap/decap The goal of this serie is to add x-netns support for the module ip_gre and ip6_gre, ie the encapsulation addresses and the network device are not owned by the same namespace. Example to configure an ipv4 gre tunnel: modprobe ip_gre ip netns add netns1 ip netns exec netns1 ip link set lo up ip link add gre1 type gre remote 10.16.0.121 local 10.16.0.249 ikey 10 okey 10 csum ip link set gre1 netns netns1 ip netns exec netns1 ip link set gre1 up ip netns exec netns1 ip addr add dev gre1 192.168.0.249 remote 192.168.0.121 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23ip6gre: add x-netns supportNicolas Dichtel1-20/+32
This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23gre: add x-netns supportNicolas Dichtel1-3/+3
This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23hyperv: Simplify the send_completion variablesHaiyang Zhang4-16/+11
The union contains only one member now, so we use the variables in it directly. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23hyperv: Remove recv_pkt_list and lockHaiyang Zhang4-198/+13
Removed recv_pkt_list and lock, and updated related code, so that the locking overhead is reduced especially when multiple channels are in use. The recv_pkt_list isn't actually necessary because the packets are processed sequentially in each channel. It has been replaced by a local variable, and the related lock for this list is also removed. The is_data_pkt field is not used in receive path, so its assignment is cleaned up. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23i40e: report VF link state correctlyMitch Williams1-0/+7
Although the driver would correctly allow the VF link state to be controlled by 'ip set link', it would not report it correctly back. Fix this by filling out the appropriate field in the vf info struct. Change-ID: I58d8e356438190e1ee9660b424301af6f416cdbe Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23igb: Cleanups to fix incorrect indentationCarolyn Wyborny11-75/+95
This patch fixes WARNING:LEADING_SPACE, WARNING:SPACING, ERROR:SPACING, WARNING:SPACE_BEFORE_TAB and ERROR_CODE_INDENT from checkpatch file check. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23igb: Cleanups to fix braces location warningsCarolyn Wyborny3-13/+9
This patch fixes WARNING:BRACES and ERROR:OPEN_BRACE from checkpatch file check. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23igb: Cleanups for messagingCarolyn Wyborny4-33/+18
This patch fixes WARNING:PREFER_PR_LEVEL and WARNING:SPLIT_STRING from checkpatch file check. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23e1000e: Cleanup use of deprecated DEFINE_PCI_DEVICE_TABLEDavid Ertman1-1/+1
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23e1000e: Cleanup checkpatch extra spaceDavid Ertman1-4/+4
Fixing "WARNING:SPACING: Unnecessary space before function pointer arguments" Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23e1000e: Cleanup to fix checkpatch missing blank linesDavid Ertman6-0/+23
Fixing "WARNING:SPACING: networking uses a blank line after declarations" Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23e1000e: Cleanup return values in ethtoolDavid Ertman2-6/+6
Changing occurrences of returning 0 and 1 from bool functions to false and true, respectively Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23ixgbevf: remove 82599 from the module descriptionEmil Tantilov1-1/+1
This patch removes 82599 from the description of the ixgbevf module since the VF driver is supported on other parts as well. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-23ixgbe: improve mac filter handlingJacob Keller3-68/+172
Add mac_table API based on work done for igb, which includes functions to add and delete mac filters. This simplifies code for various entities that use MAC filters such as VMDQ, SR-IOV, MACVLAN, and such. Reported-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-22ixgbe: change handling of multicast filtersJacob Keller3-9/+53
In line with changes done by Alex Duyck regarding unicast filters, we now only set multicast filters when the interface is not in promiscuous mode for multicast packets. This also has an impact on the RAR usage such that SR-IOV has some RARs reserved for its own usage. Reported-by: Alex Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-22ixgbe: remove vlan_filter_disable and enable functionsJacob Keller1-34/+6
Previously these functions handled stripping setup as well, but this has already been removed from these functions. Rather than encapsulating this into a function, we can just do the work directly in set_rx_mode. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-22ixgbe: Use out-of-line function for register readsMark Rustad2-15/+28
Register reads are slow, so don't inline them. Size before: text data bss dec hex filename 226337 8280 552 235169 396a1 ixgbe.ko Size after: text data bss dec hex filename 194578 8280 552 203410 31a92 ixgbe.ko for about a 14% reduction in text size. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-22ixgbe: convert low_water into an arrayJacob Keller7-35/+57
Since fc.high_water is an array, we should treat low_water as an array also. This allows the algorithm to output different values for different TCs, and then we can distinguish between them. In addition, this patch changes one path that didn't honor the return value from ixgbe_setup_fc. Reported-by: Aaron Salter <aaron.k.salter@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-04-22Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-nextDavid S. Miller16-166/+369
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to i40e and i40evf. Greg provides two patches for i40e, the first adds the netdev ops to support the addition of static FDB entries in the physical function (PF) MAC/VLAN filter table so that the virtual functions (VFs) can communicate with bridged virtual Ethernet ports such as those provided by the virtio driver. The second is to fix an issue where the assignment of a port VLAN after it is already up and running requires the VF driver to be reloaded, so print a message warning the host administrator about the need to reload the VF driver. In addition, knock the VF offline so that it does not continue to receive traffic not on the port VLAN assigned to it. Jesse provides a patch for i40e and i40evf to unhide and enable the PREFENA field in the receive host memory cache (RX-HMC) for best performance. Mitch provides a i40e patch to implement the net device op for Tx bandwidth setting. Catherine removes a firmware workaround that is no longer needed with the latest firmware for i40e. She also provides some minor cleanups as well bumps the driver versions. Anjali provides a fix for i40e displaying IPv4 flow director filters which needed additional information to be communicated up above in order for it to be displayed correctly. Shannon adds tracking of the NVM busy state so that the driver won't allow a new NVM update command until a completion event is received from the current update. Updates the admin queue API to reflect recent changes in the firmware. Also rearranges the "if netdev" logic to prepare for handling non-netdev VSIs. Lastly rework the fdir setup and tear down to use the newly created i40e_vsi_open() and i40e_vsi_close(), which also fixes a memory leak of the FDIR queue buffer info structs across a reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22Merge branch 'netlink-bind'David S. Miller8-35/+135
Richard Guy Briggs says: ==================== audit: implement multicast socket for journald This is a patch set Eric Paris and I have been working on to add a restricted capability read-only netlink multicast socket to kernel audit to enable userspace clients such as systemd/journald to receive audit logs, in addition to the bidirectional auditd userspace client. Currently, auditd has the CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE capabilities (but uses CAP_NET_ADMIN). The CAP_AUDIT_READ capability will be added for use by read-only AUDIT_NLGRP_READLOG multicast group clients to the kaudit subsystem. This will remove the dependence on CAP_NET_ADMIN for the multicast read-only socket. Patches 1-3 provide a way for per-protocol bind functions to signal an error and to be able to clean up after themselves. The first netfilter cleanup patch has already been accepted by a netfilter maintainer, though I don't see it upstream yet, so it is included for completeness. The second patch adds the per-protocol bind function return code to signal to the netlink code that no further processing should be done and to undo the work already done. V1: This rev fixes a bug introduced by flattening the code in the last posting. *V2: This rev moves the per-protocol bind call above the socket exposure call and refactors out the unbind procedure. The third provides a way per protocol to undo bind actions on DROP. Patches 4-6 implement the audit multicast socket with capability checking. The fourth patch adds the bind function capability check to multicast join requests for audit. The fifth patch adds the audit log read multicast group. An assumption has been made that systemd/journald reside in the initial network namespace. This could be changed to check the actual network namespace of systemd/journald should this assumption no longer be true since audit now supports all network namespaces. This version of the patch now directly sends the broadcast when the packet is ready rather than waiting until it passes the queue. The sixth checks if any clients actually exist before sending. Since the net tree is busier than the audit tree, conflicts are more likely and the audit patches depend on the net patches, it is proposed to have the net tree carry this entire patchset for 3.16. Are the net maintainers ok with this? https://bugzilla.redhat.com/show_bug.cgi?id=887992 First posted: https://www.redhat.com/archives/linux-audit/2013-January/msg00008.html https://lkml.org/lkml/2013/1/27/279 Please find source for a test program at: http://people.redhat.com/rbriggs/audit-multicast-listen/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22audit: send multicast messages only if there are listenersRichard Guy Briggs1-0/+3
Test first to see if there are any userspace multicast listeners bound to the socket before starting the multicast send work. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22audit: add netlink multicast group for log readRichard Guy Briggs2-4/+55
Add a netlink multicast socket with one group to kaudit for "best-effort" delivery to read-only userspace clients such as systemd, in addition to the existing bidirectional unicast auditd userspace client. Currently, auditd is intended to use the CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE capabilities, but actually uses CAP_NET_ADMIN. The CAP_AUDIT_READ capability is added for use by read-only AUDIT_NLGRP_READLOG netlink multicast group clients to the kaudit subsystem. This will safely give access to services such as systemd to consume audit logs while ensuring write access remains restricted for integrity. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22audit: add netlink audit protocol bind to check capabilities on multicast joinRichard Guy Briggs3-2/+17
Register a netlink per-protocol bind fuction for audit to check userspace process capabilities before allowing a multicast group connection. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22netlink: implement unbind to netlink_setsockopt NETLINK_DROP_MEMBERSHIPRichard Guy Briggs1-1/+3
Call the per-protocol unbind function rather than bind function on NETLINK_DROP_MEMBERSHIP in netlink_setsockopt(). Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22netlink: have netlink per-protocol bind function return an error code.Richard Guy Briggs4-24/+56
Have the netlink per-protocol optional bind function return an int error code rather than void to signal a failure. This will enable netlink protocols to perform extra checks including capabilities and permissions verifications when updating memberships in multicast groups. In netlink_bind() and netlink_setsockopt() the call to the per-protocol bind function was moved above the multicast group update to prevent any access to the multicast socket groups before checking with the per-protocol bind function. This will enable the per-protocol bind function to be used to check permissions which could be denied before making them available, and to avoid the messy job of undoing the addition should the per-protocol bind function fail. The netfilter subsystem seems to be the only one currently using the per-protocol bind function. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22netlink: simplify nfnetlink_bindRichard Guy Briggs1-5/+2
Remove duplicity and simplify code flow by moving the rcu_read_unlock() above the condition and let the flow control exit naturally at the end of the function. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>