author David S. Miller <davem@davemloft.net> 2018-12-09 21:27:48 -0800
committer David S. Miller <davem@davemloft.net> 2018-12-09 21:43:31 -0800
commit 4cc1feeb6ffc2799f8badb4dea77c637d340cb0d (patch)
tree c41c1e4c05f016298246ad7b3a6034dc1e65c154 /block/bfq-iosched.c
parent net: dsa: Make dsa_master_set_mtu() static (diff)
parent Linux 4.20-rc6 (diff)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Several conflicts, seemingly all over the place.

I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him.

The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while changing the initial argument in the function call in the moved code.

The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location.

cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classification.

__set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-)

Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next.

The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used.

Signed-off-by: David S. Miller <davem@davemloft.net>

Diffstat (limited to 'block/bfq-iosched.c')
-rw-r--r-- block/bfq-iosched.c 76
1 files changed, 54 insertions, 22 deletions
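
Reviewer note: the bfq-iosched.c hunks in this merge replace num_active_groups with num_groups_with_pending_reqs and delay the decrement until a group's pending requests have completed. The following is a minimal user-space sketch of that bookkeeping, not the kernel code. The sketch_* types and functions are hypothetical stand-ins, and the activation side in particular is an assumption, since only the decrement appears in this file's diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for struct bfq_entity and struct bfq_data. */
struct sketch_entity {
	bool in_groups_with_pending_reqs;
};

struct sketch_sched {
	unsigned int num_groups_with_pending_reqs;
};

/*
 * Assumed activation path (not shown in this diff): count the group once,
 * when it first gets pending requests.
 */
void sketch_group_activated(struct sketch_sched *sd, struct sketch_entity *e)
{
	if (!e->in_groups_with_pending_reqs) {
		e->in_groups_with_pending_reqs = true;
		sd->num_groups_with_pending_reqs++;
	}
}

/*
 * Deactivation alone leaves the counter untouched: the group may still
 * have requests dispatched but not yet completed.
 */
void sketch_group_deactivated(struct sketch_sched *sd, struct sketch_entity *e)
{
	(void)sd;
	(void)e;
}

/*
 * Delayed decrement, mirroring the guarded block added in
 * bfq_weights_tree_remove(): performed at most once per counted group.
 */
void sketch_requests_completed(struct sketch_sched *sd, struct sketch_entity *e)
{
	if (e->in_groups_with_pending_reqs) {
		assert(sd->num_groups_with_pending_reqs > 0);
		e->in_groups_with_pending_reqs = false;
		sd->num_groups_with_pending_reqs--;
	}
}

/* Group-related part of the asymmetry check: any counted group is enough. */
bool sketch_asymmetric_due_to_groups(const struct sketch_sched *sd)
{
	return sd->num_groups_with_pending_reqs > 0;
}
```

The per-entity flag is what keeps the decrement idempotent: a second completion notification for the same group finds the flag already cleared and leaves the counter alone.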
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 3a27d31fcda6..97337214bec4 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -638,7 +638,7 @@ static bool bfq_varied_queue_weights_or_active_groups(struct bfq_data *bfqd)
bfqd->queue_weights_tree.rb_node->rb_right)
#ifdef CONFIG_BFQ_GROUP_IOSCHED
) ||
- (bfqd->num_active_groups > 0
+ (bfqd->num_groups_with_pending_reqs > 0
#endif
);
}
@@ -802,7 +802,21 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
*/
break;
}
- bfqd->num_active_groups--;
+
+ /*
+ * The decrement of num_groups_with_pending_reqs is
+ * not performed immediately upon the deactivation of
+ * entity, but it is delayed to when it also happens
+ * that the first leaf descendant bfqq of entity gets
+ * all its pending requests completed. The following
+ * instructions perform this delayed decrement, if
+ * needed. See the comments on
+ * num_groups_with_pending_reqs for details.
+ */
+ if (entity->in_groups_with_pending_reqs) {
+ entity->in_groups_with_pending_reqs = false;
+ bfqd->num_groups_with_pending_reqs--;
+ }
}
}
@@ -3529,27 +3543,44 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* fact, if there are active groups, then, for condition (i)
* to become false, it is enough that an active group contains
* more active processes or sub-groups than some other active
- * group. We address this issue with the following bi-modal
- * behavior, implemented in the function
+ * group. More precisely, for condition (i) to hold because of
+ * such a group, it is not even necessary that the group is
+ * (still) active: it is sufficient that, even if the group
+ * has become inactive, some of its descendant processes still
+ * have some request already dispatched but still waiting for
+ * completion. In fact, requests have still to be guaranteed
+ * their share of the throughput even after being
+ * dispatched. In this respect, it is easy to show that, if a
+ * group frequently becomes inactive while still having
+ * in-flight requests, and if, when this happens, the group is
+ * not considered in the calculation of whether the scenario
+ * is asymmetric, then the group may fail to be guaranteed its
+ * fair share of the throughput (basically because idling may
+ * not be performed for the descendant processes of the group,
+ * but it had to be). We address this issue with the
+ * following bi-modal behavior, implemented in the function
* bfq_symmetric_scenario().
*
- * If there are active groups, then the scenario is tagged as
+ * If there are groups with requests waiting for completion
+ * (as commented above, some of these groups may even be
+ * already inactive), then the scenario is tagged as
* asymmetric, conservatively, without checking any of the
* conditions (i) and (ii). So the device is idled for bfqq.
* This behavior matches also the fact that groups are created
- * exactly if controlling I/O (to preserve bandwidth and
- * latency guarantees) is a primary concern.
+ * exactly if controlling I/O is a primary concern (to
+ * preserve bandwidth and latency guarantees).
*
- * On the opposite end, if there are no active groups, then
- * only condition (i) is actually controlled, i.e., provided
- * that condition (i) holds, idling is not performed,
- * regardless of whether condition (ii) holds. In other words,
- * only if condition (i) does not hold, then idling is
- * allowed, and the device tends to be prevented from queueing
- * many requests, possibly of several processes. Since there
- * are no active groups, then, to control condition (i) it is
- * enough to check whether all active queues have the same
- * weight.
+ * On the opposite end, if there are no groups with requests
+ * waiting for completion, then only condition (i) is actually
+ * controlled, i.e., provided that condition (i) holds, idling
+ * is not performed, regardless of whether condition (ii)
+ * holds. In other words, only if condition (i) does not hold,
+ * then idling is allowed, and the device tends to be
+ * prevented from queueing many requests, possibly of several
+ * processes. Since there are no groups with requests waiting
+ * for completion, then, to control condition (i) it is enough
+ * to check just whether all the queues with requests waiting
+ * for completion also have the same weight.
*
* Not checking condition (ii) evidently exposes bfqq to the
* risk of getting less throughput than its fair share.
@@ -3607,10 +3638,11 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
* bfqq is weight-raised is checked explicitly here. More
* precisely, the compound condition below takes into account
* also the fact that, even if bfqq is being weight-raised,
- * the scenario is still symmetric if all active queues happen
- * to be weight-raised. Actually, we should be even more
- * precise here, and differentiate between interactive weight
- * raising and soft real-time weight raising.
+ * the scenario is still symmetric if all queues with requests
+ * waiting for completion happen to be
+ * weight-raised. Actually, we should be even more precise
+ * here, and differentiate between interactive weight raising
+ * and soft real-time weight raising.
*
* As a side note, it is worth considering that the above
* device-idling countermeasures may however fail in the
@@ -5417,7 +5449,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
bfqd->idle_slice_timer.function = bfq_idle_slice_timer;
bfqd->queue_weights_tree = RB_ROOT;
- bfqd->num_active_groups = 0;
+ bfqd->num_groups_with_pending_reqs = 0;
INIT_LIST_HEAD(&bfqd->active_list);
INIT_LIST_HEAD(&bfqd->idle_list);