[Devel,rh7,2/3] net: core: use atomic high-order allocations

Submitted by Anatoly Stepanov on Oct. 29, 2016, 2:55 p.m.

Details

Message ID 1477752938-847836-3-git-send-email-astepanov@cloudlinux.com
State New
Series "net: core: optimize high-order allocations"

Commit Message

As we detected intensive direct reclaim activity in sk_page_frag_refill(),
it's reasonable to stop it from trying so hard to allocate high-order
blocks and only do so when the allocation is effortless.

This is a port of upstream (vanilla) change.
Original commit: fb05e7a89f500cfc06ae277bdc911b281928995d

We saw excessive direct memory compaction triggered by
skb_page_frag_refill. This causes performance issues and adds latency.
Commit 5640f7685831e0 introduced the order-3 allocation. According to its
changelog, the order-3 allocation isn't a must-have but an attempt to
improve performance. Direct memory compaction, however, has high
overhead, and the benefit of the order-3 allocation can't compensate for
that overhead.

This patch makes the order-3 page allocation atomic. If there is no
memory pressure and memory isn't fragmented, the allocation will still
succeed, so we don't sacrifice the order-3 benefit here. If the atomic
allocation fails, direct memory compaction is not triggered and
skb_page_frag_refill falls back to order-0 immediately, so the direct
memory compaction overhead is avoided. In the allocation-failure case,
kswapd is woken up and performs compaction, so chances are the allocation
will succeed next time.

The same change applies to alloc_skb_with_frags.

The Mellanox driver does a similar thing; if this change is accepted, we
should fix the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Anatoly Stepanov <astepanov@cloudlinux.com>
---
 net/core/sock.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)


diff --git a/net/core/sock.c b/net/core/sock.c
index a94e1d0..763bd5d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1816,7 +1816,7 @@  struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 
 			while (order) {
 				if (npages >= 1 << order) {
-					page = alloc_pages(sk->sk_allocation |
+					page = alloc_pages((sk->sk_allocation & ~__GFP_WAIT)|
 							   __GFP_COMP |
 							   __GFP_NOWARN |
 							   __GFP_NORETRY,
@@ -1874,14 +1874,15 @@  bool sk_page_frag_refill(struct sock *sk, struct page_frag *pfrag)
 		put_page(pfrag->page);
 	}
 
-	/* We restrict high order allocations to users that can afford to wait */
-	order = (sk->sk_allocation & __GFP_WAIT) ? SKB_FRAG_PAGE_ORDER : 0;
+	order = SKB_FRAG_PAGE_ORDER;
 
 	do {
 		gfp_t gfp = sk->sk_allocation;
 
-		if (order)
+		if (order) {
 			gfp |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
+			gfp &= ~__GFP_WAIT;
+		}
 		pfrag->page = alloc_pages(gfp, order);
 		if (likely(pfrag->page)) {
 			pfrag->offset = 0;