{"id":886,"date":"2014-06-21T05:28:20","date_gmt":"2014-06-21T05:28:20","guid":{"rendered":"http:\/\/poojanwagh.opalstacked.com\/techblog\/?p=886"},"modified":"2014-06-21T05:28:20","modified_gmt":"2014-06-21T05:28:20","slug":"keeping-freebsd-tcp-performance-in-the-midst-of-a-highly-buffered-connection","status":"publish","type":"post","link":"https:\/\/tech.poojanblog.com\/blog\/unix-linux\/keeping-freebsd-tcp-performance-in-the-midst-of-a-highly-buffered-connection\/","title":{"rendered":"Keeping FreeBSD TCP performance in the midst of a highly-buffered connection"},"content":{"rendered":"<p>I was perplexed recently, when I began an rsync job to a raspberry pi server. I know exactly what limits the bandwidth of this connection&#8211;it is the CPU (or network) on the Raspberry Pi, which cannot accept data fast enough.<\/p>\n<p>So, even though my server is on a 1 Gbit\/s interface, and the Raspberry Pi is on a 100 Mbit\/s interface, the transfer rate is ~ 10 Mbit\/s. Fair enough.<\/p>\n<p>But what really perplexed me is that the presence of this rsync connection severly limited other connections&#8211;notably Samba. The Simpson&#8217;s show in the living room had audio that was noticeably stuttering.<\/p>\n<p>So, I began to investigate. This same low-rate occurred with iperf. It seemed a little better from my basement computer than the living room machine. 
Here is an iperf from the basement to the FreeBSD server:<\/p>\n<p><code><br \/>\nC:\\Users\\Poojan\\Downloads\\iperf-2.0.5-2-win32&gt;iperf -c server<br \/>\n------------------------------------------------------------<br \/>\nClient connecting to server, TCP port 5001<br \/>\nTCP window size: 64.0 KByte (default)<br \/>\n------------------------------------------------------------<br \/>\n[ 3] local 192.168.1.20 port 64155 connected with 192.168.1.8 port 5001<br \/>\n[ ID] Interval Transfer Bandwidth<br \/>\n[ 3] 0.0-10.1 sec 25.1 MBytes 21.0 Mbits\/sec<br \/>\n<\/code><\/p>\n<p>Whereas without rsync going, it would be around 670 Mbit\/s or so.<\/p>\n<p>I started playing around with buffers. Curiously, reducing sendbuf_max helped:<\/p>\n<p><code><br \/>\nPoojan@server ~ &gt;sudo sysctl net.inet.tcp.sendbuf_max<br \/>\nnet.inet.tcp.sendbuf_max: 262144<br \/>\nPoojan@server ~ &gt;sudo sysctl net.inet.tcp.sendbuf_max=65536<br \/>\nnet.inet.tcp.sendbuf_max: 262144 -&gt; 65536<br \/>\n<\/code><\/p>\n<p>Which yielded:<br \/>\n<code><br \/>\nC:\\Users\\Poojan\\Downloads\\iperf-2.0.5-2-win32&gt;iperf -c server<br \/>\n------------------------------------------------------------<br \/>\nClient connecting to server, TCP port 5001<br \/>\nTCP window size: 64.0 KByte (default)<br \/>\n------------------------------------------------------------<br \/>\n[ 3] local 192.168.1.20 port 64171 connected with 192.168.1.8 port 5001<br \/>\n[ ID] Interval Transfer Bandwidth<br \/>\n[ 3] 0.0-10.0 sec 788 MBytes 661 Mbits\/sec<br \/>\n<\/code><\/p>\n<p>I posited that maybe there&#8217;s some overall limit to the buffers, and rsync was stealing all of them, so making them smaller allowed more buffers to be available to iperf. 
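<\/p>\n<p>(Incidentally, the Send-Q column of <code>netstat<\/code> shows how many bytes each connection currently has queued in its send buffer&#8211;a handy way to check which connection is hogging buffer space:)<\/p>\n<p><code><br \/>\nPoojan@server ~ &gt;netstat -an -p tcp<br \/>\n<\/code><\/p>\n<p>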
I went hunting for this limit.<\/p>\n<p>I tried doubling kern.ipc.maxsockbuf:<\/p>\n<p><code><br \/>\nPoojan@server ~ &gt;sudo sysctl -w kern.ipc.maxsockbuf=524288<br \/>\nkern.ipc.maxsockbuf: 262144 -&gt; 524288<br \/>\n<\/code><\/p>\n<p>which yielded:<br \/>\n<code><br \/>\nC:\\Users\\Poojan\\Downloads\\iperf-2.0.5-2-win32&gt;iperf -c server<br \/>\n------------------------------------------------------------<br \/>\nClient connecting to server, TCP port 5001<br \/>\nTCP window size: 64.0 KByte (default)<br \/>\n------------------------------------------------------------<br \/>\n[ 3] local 192.168.1.20 port 64216 connected with 192.168.1.8 port 5001<br \/>\n[ ID] Interval Transfer Bandwidth<br \/>\n[ 3] 0.0-10.0 sec 25.0 MBytes 20.9 Mbits\/sec<br \/>\n<\/code><\/p>\n<p>No luck. Note: I realized that the above was with Jumbo frames enabled on both server &amp; client. I disabled jumbo on client.<\/p>\n<p>I then did a <code>netstat -m<\/code>, just in case:<\/p>\n<p><code><br \/>\n1470\/5175\/6645 mbufs in use (current\/cache\/total)<br \/>\n271\/2635\/2906\/10485760 mbuf clusters in use (current\/cache\/total\/max)<br \/>\n271\/2635 mbuf+clusters out of packet secondary zone in use (current\/cache)<br \/>\n85\/335\/420\/762208 4k (page size) jumbo clusters in use (current\/cache\/total\/max)<br \/>\n1041\/361\/1402\/225839 9k jumbo clusters in use (current\/cache\/total\/max)<br \/>\n0\/0\/0\/127034 16k jumbo clusters in use (current\/cache\/total\/max)<br \/>\n10618K\/11152K\/21771K bytes allocated to network (current\/cache\/total)<br \/>\n1106\/2171\/531 requests for mbufs denied (mbufs\/clusters\/mbuf+clusters)<br \/>\n0\/0\/0 requests for mbufs delayed (mbufs\/clusters\/mbuf+clusters)<br \/>\n0\/0\/0 requests for jumbo clusters delayed (4k\/9k\/16k)<br \/>\n361\/1345\/0 requests for jumbo clusters denied (4k\/9k\/16k)<br \/>\n0 requests for sfbufs denied<br \/>\n0 requests for sfbufs delayed<br \/>\n0 requests for I\/O initiated by sendfile<br 
\/>\n<\/code><\/p>\n<p>This didn&#8217;t really show any indication that buffers were being over-subscribed, at least not during the tests.<\/p>\n<p>But now, with a sendbuf_max size of 262144 and a maxsockbuf size of 524288, my iperf reading went <em>down<\/em>:<\/p>\n<p><code><br \/>\nC:\\Users\\Poojan\\Downloads\\iperf-2.0.5-2-win32&gt;iperf -c server<br \/>\n------------------------------------------------------------<br \/>\nClient connecting to server, TCP port 5001<br \/>\nTCP window size: 64.0 KByte (default)<br \/>\n------------------------------------------------------------<br \/>\n[ 3] local 192.168.1.20 port 64438 connected with 192.168.1.8 port 5001<br \/>\n[ ID] Interval Transfer Bandwidth<br \/>\n[ 3] 0.0-10.7 sec 2.88 MBytes 2.25 Mbits\/sec<br \/>\n<\/code><\/p>\n<p>From reading this summary of FreeBSD buffers, it seems that kern.ipc.maxsockbuf operates at a different level than net.inet.tcp.sendbuf, and in fact, both of these being large was hurting performance. So maybe this is just pure bufferbloat.<\/p>\n<p>But then I realized that my better results came when the sendbuf was less than 64k. So, I disabled RFC1323 (which, in addition to time-stamps, enables window scaling&#8211;i.e., windows larger than 64k). 
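The toggle (reconstructed&#8211;I didn't save that session, but <code>net.inet.tcp.rfc1323<\/code> is the stock FreeBSD knob for window scaling and timestamps):<\/p>\n<p><code><br \/>\nPoojan@server ~ &gt;sudo sysctl net.inet.tcp.rfc1323=0<br \/>\nnet.inet.tcp.rfc1323: 1 -&gt; 0<br \/>\n<\/code><\/p>\n<p>(Adding <code>net.inet.tcp.rfc1323=0<\/code> to \/etc\/sysctl.conf makes the setting persist across reboots.) 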
And voila!<\/p>\n<p><code><br \/>\nC:\\Users\\Poojan\\Downloads\\iperf-2.0.5-2-win32&gt;iperf -c server<br \/>\n------------------------------------------------------------<br \/>\nClient connecting to server, TCP port 5001<br \/>\nTCP window size: 64.0 KByte (default)<br \/>\n------------------------------------------------------------<br \/>\n[  3] local 192.168.1.20 port 65203 connected with 192.168.1.8 port 5001<br \/>\n[ ID] Interval       Transfer     Bandwidth<br \/>\n[  3]  0.0-10.0 sec   798 MBytes   669 Mbits\/sec<br \/>\n<\/code><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was perplexed recently when I began an rsync job to a Raspberry Pi server. I know exactly what limits the bandwidth of this connection&#8211;it is the CPU (or network) on the Raspberry Pi, which cannot accept data fast enough. 
So, even though my server is on a 1 Gbit\/s interface, and the Raspberry Pi [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[10],"tags":[12,49,214,215,150],"class_list":["post-886","post","type-post","status-publish","format-standard","hentry","category-unix-linux","tag-freebsd","tag-network","tag-performanc","tag-rfc1323","tag-tcp"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/posts\/886","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/comments?post=886"}],"version-history":[{"count":9,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/posts\/886\/revisions"}],"predecessor-version":[{"id":896,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/posts\/886\/revisions\/896"}],"wp:attachment":[{"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/media?parent=886"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/categories?post=886"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tech.poojanblog.com\/blog\/wp-json\/wp\/v2\/tags?post=886"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}