Linux NET_DMA TCP receive offload

The Linux kernel can offload the receive-side copy for TCP (CONFIG_NET_DMA). I used iperf (TCP window size = 250 KB, buffer length = 2 MB) and oprofile to test performance in three cases: with NET_DMA disabled, with NET_DMA enabled, and with NET_DMA enabled and sk_rcvlowat set to 200 KB. The results are as follows:

  • NET_DMA disabled: bandwidth reaches 930 Mbit/s; __copy_tofrom_user consumes 36.1% of CPU time.

  • NET_DMA enabled: bandwidth is 40 Mbit/s lower than in the case above (890 Mbit/s); __copy_tofrom_user consumes 33.5% of CPU time.

  • NET_DMA enabled (sk_rcvlowat = 200 KB): bandwidth is 874 Mbit/s; __copy_tofrom_user consumes 25.1% of CPU time.
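The post does not say how sk_rcvlowat was set for the third case. One way to do it from user space, assuming the receiving program can be modified, is the SO_RCVLOWAT socket option. The helper names below are made up for illustration, and 200 KB is taken to mean 200 * 1024 bytes.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to set the receive low-water mark
 * (sk->sk_rcvlowat) on a socket. Returns 0 on success, -1 on error with
 * errno set by setsockopt(). */
static int set_rcvlowat(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes));
}

/* Hypothetical helper: read the low-water mark back.
 * Returns the value in bytes, or -1 on error. */
static int get_rcvlowat(int fd)
{
    int bytes = 0;
    socklen_t optlen = sizeof(bytes);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, &optlen) < 0)
        return -1;
    return bytes;
}
```

Calling set_rcvlowat(fd, 200 * 1024) on the receiving socket before the first read would correspond to the third test case. A kernel may clamp the value, so reading it back with get_rcvlowat() is a reasonable sanity check.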

I also looked at the tcp_recvmsg() function (in /net/ipv4/tcp.c) in kernel version 2.6.32.2. This is how I understand NET_DMA to work:

// at the beginning of tcp_recvmsg()
   target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
#ifdef CONFIG_NET_DMA
   tp->ucopy.dma_chan = NULL;
   preempt_disable();
   skb = skb_peek_tail(&sk->sk_receive_queue);
   {
           int available = 0;

           if (skb)
                   available = TCP_SKB_CB(skb)->seq + skb->len - (*seq);
           if ((available < target) &&
               (len > sysctl_tcp_dma_copybreak) && !(flags & MSG_PEEK) &&
               !sysctl_tcp_low_latency &&
               dma_find_channel(DMA_MEMCPY)) {
                   preempt_enable_no_resched();
                   tp->ucopy.pinned_list =
                                   dma_pin_iovec_pages(msg->msg_iov, len);
           } else {
                   preempt_enable_no_resched();
           }
   }
#endif

len: the buffer length, which can be set with the -l option in iperf.

target: the minimum number of bytes tcp_recvmsg() should return. If sk->sk_rcvlowat is not set, I saw that target usually gets the value 1 (a DMA transfer rarely happens when target = 1).

available: the number of bytes queued in the receive queue, counted from the current read position (*seq) to the end of the last skb.

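When the queued data is not enough to satisfy the request (available < target) and the other conditions hold, tcp_recvmsg() pins the user buffer pages for DMA with I/OAT.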
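To make the check at the top of tcp_recvmsg() easier to reason about, here is a user-space model of it. This is only a sketch: the struct and function names are invented, and the kernel globals and socket fields are passed in as plain values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative container for the inputs of the check.
 * None of these names exist in the kernel. */
struct dma_pin_inputs {
    bool     have_skb;     /* false when the receive queue is empty      */
    uint32_t tail_seq;     /* TCP_SKB_CB(skb)->seq of the tail skb       */
    uint32_t tail_len;     /* skb->len of the tail skb                   */
    uint32_t copied_seq;   /* *seq: next byte the reader will consume    */
    int      target;       /* result of sock_rcvlowat()                  */
    size_t   len;          /* size of the user buffer                    */
    size_t   copybreak;    /* stands in for sysctl_tcp_dma_copybreak     */
    bool     msg_peek;     /* flags & MSG_PEEK                           */
    bool     low_latency;  /* stands in for sysctl_tcp_low_latency       */
    bool     have_channel; /* dma_find_channel(DMA_MEMCPY) != NULL       */
};

/* Bytes queued between the read position and the end of the tail skb.
 * Unsigned arithmetic wraps modulo 2^32, like TCP sequence numbers. */
static int bytes_available(const struct dma_pin_inputs *in)
{
    if (!in->have_skb)
        return 0;
    return (int)(in->tail_seq + in->tail_len - in->copied_seq);
}

/* True when the modelled tcp_recvmsg() would pin the user pages. */
static bool would_pin_pages(const struct dma_pin_inputs *in)
{
    return bytes_available(in) < in->target &&
           in->len > in->copybreak &&
           !in->msg_peek &&
           !in->low_latency &&
           in->have_channel;
}
```

With target = 1, any queued data already makes available >= target, so the pages are pinned only when the queue is empty at the moment of the call. That is consistent with the observation that DMA rarely happens unless sk_rcvlowat is raised.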

// inside the copy loop of tcp_recvmsg()
if (copied >= target) {
   /* Do not sleep, just process backlog. */
   release_sock(sk);
   lock_sock(sk);
} else
   sk_wait_data(sk, &timeo);

While the receiver waits, incoming segments are handled in tcp_rcv_established() (in /net/ipv4/tcp_input.c), which calls tcp_dma_try_early_copy() when NET_DMA is enabled.

// /net/ipv4/tcp_input.c: tcp_dma_try_early_copy()
if ((tp->ucopy.len == 0) ||
   (tcp_flag_word(tcp_hdr(skb)) & TCP_FLAG_PSH) ||
   (atomic_read(&sk->sk_rmem_alloc) > (sk->sk_rcvbuf >> 1))) {
       tp->ucopy.wakeup = 1;
       sk->sk_data_ready(sk, 0);
}

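The DMA path in tcp_dma_try_early_copy() wakes the receiver when the user buffer is full (tp->ucopy.len == 0), when the segment carries the PSH flag, or when the queued skb data exceeds 1/2 of sk_rcvbuf (as far as I can tell, sk_rcvbuf follows the TCP window size set in iperf).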
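The wake-up condition in the tcp_dma_try_early_copy() excerpt can be restated as a small predicate. This is a user-space sketch with an invented function name; the socket fields are passed in as plain integers.

```c
#include <stdbool.h>

/* Mirrors the three-way OR in the excerpt:
 *   ucopy_len  - bytes still free in the user buffer (tp->ucopy.len)
 *   psh        - whether the segment has TCP_FLAG_PSH set
 *   rmem_alloc - receive memory currently charged to the socket
 *   rcvbuf     - sk->sk_rcvbuf
 * Returns true when the reader would be woken. */
static bool early_copy_wakes_reader(int ucopy_len, bool psh,
                                    int rmem_alloc, int rcvbuf)
{
    return ucopy_len == 0 ||
           psh ||
           rmem_alloc > (rcvbuf >> 1);
}
```

Note that the half-buffer test is a strict comparison, so a socket charged with exactly half of sk_rcvbuf does not trigger the wake-up on that ground alone.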

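I am still learning the Linux TCP/IP stack, so parts of the above may be wrong. My questions:

Q1: Why is the bandwidth lower with NET_DMA enabled than without NET_DMA?

Q2: How should the parameters (TCP window size, buffer length, sk_rcvlowat) be tuned to get better performance with NET_DMA?

Q3: Each DMA transfer is only 1448 bytes. Is that too small to be worth DMAing?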
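For context on the 1448-byte figure in Q3: it matches the TCP payload of a full-sized segment on a 1500-byte MTU, assuming IPv4 without options and TCP timestamps enabled (20 + 20 + 12 bytes of headers). The sketch below works through that arithmetic and counts how many such segments it takes to fill a user buffer. The constants and function names are illustrative.

```c
#include <stddef.h>

/* Header sizes assumed for this sketch. */
enum {
    IPV4_HDR_BYTES   = 20, /* IPv4 header without options            */
    TCP_HDR_BYTES    = 20, /* TCP header without options             */
    TCP_TS_OPT_BYTES = 12  /* timestamp option, padded to 12 bytes   */
};

/* TCP payload carried by one full-sized segment for a given MTU. */
static size_t tcp_payload_per_segment(size_t mtu)
{
    return mtu - IPV4_HDR_BYTES - TCP_HDR_BYTES - TCP_TS_OPT_BYTES;
}

/* Number of segments (rounded up) needed to fill a buffer of buf_len. */
static size_t segments_for_buffer(size_t buf_len, size_t payload)
{
    return (buf_len + payload - 1) / payload;
}
```

For the 2 MB iperf buffer used in the tests, segments_for_buffer(2 * 1024 * 1024, tcp_payload_per_segment(1500)) gives 1449, so filling one buffer would take on the order of 1449 separate copies if each segment is copied individually.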



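I think that for transfers this small (1448 bytes) the overhead of driving IOAT is too high, so it ends up no faster than a plain memcpy.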

10Gbit/sec Ethernet MTU, , , , . , , PAGE_SIZE.
