Performance Tuning Guidelines for Mellanox Network Adapters
Contents
3.9.1 Handling Huge Pages
3.9.2 Reducing Memory Footprint
3.9.3 Polling Configurations
3.9.4 Handling Single-Threaded Processes
3.9.5 Reducing DMAs
4 Performance Tuning for Virtualized Environment
4.1 Tuning for Hypervisor
5 Performance Tuning for Windows
5.1 Tuning the Network Adapter
5.2 Tuning for NUMA Architecture
5.2.1 Tuning for Intel Microarchitecture Code name Sandy Bridge
5.2.2 Tuning for AMD Architecture
5.2.3 Running an Application on a Certain NUMA Node
5.3 Tuning for Windows Server 2012
5.3.1 Recognizing NUMA Node Cores
5.4 Finding the Closest NUMA Node to the NIC
5.5 Tuning for Windows 2008 R2
5.5.1 Tuning for Multiple Adapters
5.5.2 Recognizing NUMA Node Cores
5.6 Performance Testing

List of Tables
Table 1: Document Revision History
Table 2: Recommended PCIe Configuration

Revision History

Table 1: Document Revision History

Revision 1.10, December 2013:
- Updated section Performance Testing.
Revision 1.10, October 2013:
- Updated section Kernel Idle Loop Tuning.
- Added section Performance Tuning for Virtualized Environment.
Revision 1.9, September 2013:
- Updated section Interrupt Moderation.
Revision 1.8, June 2013:
- Removed section Tuning for Windows Server 2008 and its sub-sections.
- Added sections Recognizing NUMA Node Cores and Finding the Closest NUMA Node to the NIC.
Revision 1.7, April 2013:
- Updated the following sections: Recommended BIOS Settings, Tuning Power Management, Tuning for Intel Sandy Bridge, IRQ Affinity Configuration, Tuning Multi-Threaded IP Forwarding, and Tuning for Multiple Adapters.
- Replaced section Tuning for IPoIB Interfaces with Auto Tuning Utility.
- Added section Improving Application Performance on Remote NUMA Node.
> To enable affinity by default, add the line below to the /etc/infiniband/openib.conf file:
RUN_AFFINITY_TUNER=yes

3.7.3 Tuning for Multiple Adapters
When optimizing the system performance for more than one adapter, it is recommended to separate each adapter's core utilization so there is no interleaving between interfaces. The following script can be used to assign each adapter's IRQs to a different set of cores:
set_irq_affinity_cpulist.sh <cpu list> <interface>
<cpu list> can be either a comma-separated list of single core numbers (0,1,2,3) or a core group (0-3).
Example: If the system has 2 adapters on the same NUMA node (cores 0-7), each with 2 interfaces, run the following:
/etc/init.d/irqbalance stop
set_irq_affinity_cpulist.sh 0,1 <interface1>
set_irq_affinity_cpulist.sh 2,3 <interface2>
set_irq_affinity_cpulist.sh 4,5 <interface3>
set_irq_affinity_cpulist.sh 6,7 <interface4>

3.8 Tuning Multi-Threaded IP Forwarding
> To optimize NIC usage for IP forwarding:
1. Set the following options in /etc/modprobe.d/mlx4.conf.
For MLNX_OFED 2.0.x:
options mlx4_en inline_thold=0
options mlx4_core high_rate_steer=1
For MLNX_EN 1.5.10:
options mlx4_en num_lro=0 inline_thold=0
options mlx4_core high_rate_steer=1
2. Apply interrupt affinity tuning.
3. For forwarding on the same interface:
set_irq_affinity_bynode.sh <node> <interface>
- Added new chapter Tuning VMA Parameters.
Revision 1.1:
- Updated the following sections: Intel Hyper-Threading Technology, Tuning the Network Adapter for Improved IPv4 Traffic Performance, and Example Script for Setting Interrupt Affinity.
- Added new section Tuning IP Forwarding.

1 Introduction
Depending on the application of the user's system, it may be necessary to modify the default configuration of network adapters based on ConnectX adapters. This document describes important tuning parameters and settings that can improve performance for Mellanox drivers. Each setting, along with its potential effect, is described to help in making an informed judgment concerning its relevance to the user's system, the system workload, and the performance goals. Tuning is relevant for both Ethernet and IPoIB network interfaces.

1.1 Relevant Mellanox Drivers
The tuning guidelines described in this document apply to the following Mellanox software drivers:
- On Linux: Mellanox Ethernet Driver MLNX_EN for Linux, version 1.5.10 and later
- On Linux: Mellanox VPI Driver MLNX_OFED for Linux, version 2.0.x and later
- On Windows: Mellanox OFED for Windows MLNX_VPI, version 4.40 and later
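Before applying the guidelines in the following chapters, it can be useful to confirm which of these drivers, and which version, a host is actually running. A minimal sketch, assuming an mlx4-based adapter and that the ofed_info utility (shipped with MLNX_OFED) may or may not be present:

modinfo mlx4_en | grep -i ^version    # reports the mlx4_en driver version
ofed_info -s 2>/dev/null              # reports the MLNX_OFED release, if installed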
4. For forwarding from one interface to another:
set_irq_affinity_bynode.sh <node> <interface1> <interface2>
5. Disable adaptive interrupt moderation and set static values, using ethtool:
ethtool -C <interface> adaptive-rx off

3.9 Tuning VMA Parameters
This section provides guidelines for improving performance with VMA. It is intended for administrators who are familiar with VMA and should be used in conjunction with the VMA User Manual and the VMA Release Notes.
You can minimize latency by tuning VMA parameters. It is recommended to test VMA performance tuning on an actual application.
We suggest that you try the following VMA parameters one by one, and in combination, to find the optimum for your application.
For more information about each parameter, see the VMA User Manual.
To perform tuning, add the VMA configuration parameters when you run VMA, after LD_PRELOAD, for example:
LD_PRELOAD=libvma.so VMA_MTU=200 ./my-application

3.9.1 Handling Huge Pages
Improve the handling of huge pages as follows (see also the example after this list):
- Before running VMA, enable the kernel and VMA huge tables, for example:
echo 1000000000 > /proc/sys/kernel/shmmax
echo 400 > /proc/sys/vm/nr_hugepages
Note: Increase the amount of shared memory (bytes) and huge pages if you receive a warning about an insufficient number of huge pages allocated in the system.
- Enable VMA_HUGETBL to improve receive and send performance. When enabled, VMA attempts to allocate data buffers as huge pages.
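A minimal end-to-end sketch of the steps above, assuming VMA_HUGETBL=1 is the enabling value for your VMA version (check the VMA User Manual) and my-application is a placeholder workload:

echo 1000000000 > /proc/sys/kernel/shmmax   # enlarge the shared memory limit
echo 400 > /proc/sys/vm/nr_hugepages        # reserve 400 huge pages
grep -i huge /proc/meminfo                  # verify HugePages_Total / HugePages_Free
LD_PRELOAD=libvma.so VMA_HUGETBL=1 VMA_MTU=200 ./my-application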
3.6.2 Tuning for AMD Architecture
On AMD architecture there is a difference between a 2-socket system and a 4-socket system:
- With a 2-socket system, the PCIe adapter will be connected to socket 0 (nodes 0, 1).
- With a 4-socket system, the PCIe adapter will be connected either to socket 0 (nodes 0, 1) or to socket 3 (nodes 6, 7).

3.6.3 Recognizing NUMA Node Cores
> To recognize NUMA node cores, run the following commands:
cat /sys/devices/system/node/node<X>/cpulist
cat /sys/devices/system/node/node<X>/cpumap
Example:
cat /sys/devices/system/node/node1/cpulist
1,3,5,7,9,11,13,15
cat /sys/devices/system/node/node1/cpumap
0000aaaa

3.6.3.1 Running an Application on a Certain NUMA Node
In order to run an application on a certain NUMA node, the process affinity should be set either in the command line or with an external tool. For example, if the adapter's NUMA node is 1 and NUMA node 1 cores are 8-15, then an application should run with a process affinity that uses cores 8-15 only.
> To run an application, run the following commands:
taskset -c 8-15 ib_write_bw -a
or:
taskset 0xff00 ib_write_bw -a

3.7 IRQ Affinity
The affinity of an interrupt is defined as the set of processor cores that service that interrupt. To improve application scalability and latency, it is recommended to distribute interrupt requests (IRQs) between the available processor cores.
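The hexadecimal masks used by taskset above (and by the IRQ affinity files in the following sections) are plain bit masks in which bit i stands for core i. A hedged sketch of how the 0xff00 mask for cores 8-15 can be derived in shell:

mask=0
for c in $(seq 8 15); do
    mask=$((mask | (1 << c)))     # set bit c for each core in the range
done
printf '0x%x\n' "$mask"           # prints 0xff00, as used by taskset above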
Auto-tuning can be performed using the User Interface as well. For further information, please refer to section Tuning the Network Adapter.

5.5.1 Tuning for Multiple Adapters
When optimizing the system performance for more than one adapter, it is recommended to separate each adapter's core utilization so there is no interleaving between interfaces.
Please use the perf_tuning.exe manual option to assign each adapter's cores to a different set of cores:
perf_tuning.exe -m -c1 <first connection name> -b <base RSS processor number> -n <number of RSS processors>
Example: If the system has 2 adapters on the same NUMA node (cores 0-7), each with 2 interfaces, run the following:
perf_tuning.exe -m -c1 <first connection name> -b 0 -n 2
perf_tuning.exe -m -c1 <second connection name> -b 2 -n 2
perf_tuning.exe -m -c1 <third connection name> -b 4 -n 2
perf_tuning.exe -m -c1 <fourth connection name> -b 6 -n 2

5.5.2 Recognizing NUMA Node Cores
> To recognize NUMA node cores, perform the following:
1. Open the Task Manager.
2. Go to the Processes tab.
3. Right-click on one of the processes and choose Set affinity. A table of the available cores and NUMA nodes will be displayed.

5.6 Performance Testing
The preferred tool for performance testing is NTttcp. The tool was developed by Microsoft and is well optimized for Windows operating systems.
sysctl -w net.core.wmem_max=4194304
sysctl -w net.core.rmem_default=4194304
sysctl -w net.core.wmem_default=4194304
sysctl -w net.core.optmem_max=4194304
- Increase memory thresholds to prevent packet dropping:
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"
- Enable low-latency mode for TCP:
sysctl -w net.ipv4.tcp_low_latency=1

3.2 Tuning the Network Adapter for Improved IPv6 Traffic Performance
The following changes are recommended for improving IPv6 traffic performance:
- Disable the TCP timestamps option for better CPU utilization:
sysctl -w net.ipv4.tcp_timestamps=0
- Enable the TCP selective acks option for better throughput:
sysctl -w net.ipv4.tcp_sack=1

3.3 Preserving Your Performance Settings after a Reboot
To preserve your performance settings after a reboot, you need to add them to the file /etc/sysctl.conf as follows:
<sysctl name1> = <value1>
<sysctl name2> = <value2>
<sysctl name3> = <value3>
<sysctl name4> = <value4>
For example, section Tuning the Network Adapter for Improved IPv4 Traffic Performance lists the following setting to disable the TCP timestamps option:
sysctl -w net.ipv4.tcp_timestamps=0
In order to keep the TCP timestamps option disabled after a reboot, add the following line to /etc/sysctl.conf:
net.ipv4.tcp_timestamps = 0
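A hedged /etc/sysctl.conf excerpt persisting several of the settings from this chapter in the same way (the values are the suggestions above, not mandates):

# /etc/sysctl.conf excerpt
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.core.netdev_max_backlog = 250000
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
# Apply without rebooting:
#   sysctl -p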
Mellanox Technologies
350 Oakmead Parkway, Suite 100
Sunnyvale, CA 94085
U.S.A.
www.mellanox.com
Tel: (408) 970-3400
Fax: (408) 970-3403

Mellanox Technologies, Ltd.
Beit Mellanox
PO Box 586, Yokneam 20692
Israel
www.mellanox.com
Tel: +972 (0)74 723 7200
Fax: +972 (0)4 959 3245

Copyright 2013. Mellanox Technologies. All Rights Reserved.
Mellanox, Mellanox logo, BridgeX, ConnectX, CORE-Direct, InfiniBridge, InfiniHost, InfiniScale, MLNX-OS, PhyX, SwitchX, UFM, Virtual Protocol Interconnect and Voltaire are registered trademarks of Mellanox Technologies, Ltd.
Connect-IB, ExtendX, FabricIT, Mellanox Open Ethernet, Mellanox Virtual Modular Switch, MetroX, MetroDX, ScalableHPC, Unbreakable-Link are trademarks of Mellanox Technologies, Ltd.
All other trademarks are property of their respective owners.

Document Number: 3368

Contents
Revision History
1 Introduction
1.1 Relevant Mellanox Drivers
2 General System Configurations
2.1 PCI Express (PCIe) Capabilities
2.2 Memory Configuration
2.3 Recommended BIOS Settings
2.3.1 General
2.3.2 Intel Sandy Bridge Processors
We recommend using only cores that have D = 0, implying they are within the closest NUMA node to the NIC.

5.5 Tuning for Windows 2008 R2
Note: perf_tuning.exe is supported in Windows 2008 R2 only.
Please use the perf_tuning.exe tool that comes with the MLNX_VPI driver. It will recognize the adapter's NUMA node automatically and set the relevant registry keys accordingly.
This tool is based on information retrieved from a tuning document that can be found here:
http://msdn.microsoft.com/en-us/windows/hardware/gg463392.aspx
The following are the auto-tuning options:
- Optimized for single port use, when most of the traffic is utilizing one of the NIC ports:
perf_tuning.exe -s -c1 <connection name>
- Optimized for dual port use, when most of the traffic is utilizing both of the NIC ports:
perf_tuning.exe -d -c1 <first connection name> -c2 <second connection name>
- Optimized for IP Routing (RFC2544):
perf_tuning.exe -f -c1 <first connection name> -c2 <second connection name>
- For multicast streams tuning:
perf_tuning.exe -mc -c1 <first connection name> -c2 <second connection name>
- For single connection applications:
perf_tuning.exe -st -c1 <connection name>
Mellanox Technologies

Performance Tuning Guidelines for Mellanox Network Adapters
Revision 1.10
Last Updated: December 04, 2013
www.mellanox.com

NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ("PRODUCT(S)") AND ITS RELATED DOCUMENTATION ARE PROVIDED BY MELLANOX TECHNOLOGIES "AS-IS" WITH ALL FAULTS OF ANY KIND AND SOLELY FOR THE PURPOSE OF AIDING THE CUSTOMER IN TESTING APPLICATIONS THAT USE THE PRODUCTS IN DESIGNATED SOLUTIONS. THE CUSTOMER'S MANUFACTURING TEST ENVIRONMENT HAS NOT MET THE STANDARDS SET BY MELLANOX TECHNOLOGIES TO FULLY QUALIFY THE PRODUCT(S) AND/OR THE SYSTEM USING IT. THEREFORE, MELLANOX TECHNOLOGIES CANNOT AND DOES NOT GUARANTEE OR WARRANT THAT THE PRODUCTS WILL OPERATE WITH THE HIGHEST QUALITY. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT ARE DISCLAIMED. IN NO EVENT SHALL MELLANOX BE LIABLE TO CUSTOMER OR ANY THIRD PARTIES FOR ANY DIRECT, INDIRECT, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES OF ANY KIND (INCLUDING, BUT NOT LIMITED TO, PAYMENT FOR PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY FROM THE USE OF THE PRODUCT(S) AND RELATED DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
3.5 Interrupt Moderation
Interrupt moderation is used to decrease the frequency of network adapter interrupts to the CPU. Mellanox network adapters use an adaptive interrupt moderation algorithm by default. The algorithm checks the transmission (Tx) and receive (Rx) packet rates and modifies the Rx interrupt moderation settings accordingly.
To manually set Tx and/or Rx interrupt moderation, use the ethtool utility. For example, the following commands first show the current (default) setting of interrupt moderation on the interface eth1, then turn off Rx interrupt moderation, and last show the new setting:
> ethtool -c eth1
Coalesce parameters for eth1:
Adaptive RX: on  TX: off
pkt-rate-low: 400000
pkt-rate-high: 450000
rx-usecs: 16
rx-frames: 88
rx-usecs-irq: 0
> ethtool -C eth1 adaptive-rx off rx-usecs 0 rx-frames 0
> ethtool -c eth1
Coalesce parameters for eth1:
Adaptive RX: off  TX: off
pkt-rate-low: 400000
pkt-rate-high: 450000
rx-usecs: 0
rx-frames: 0
rx-frames-irq: 0
Note: When working with a 1GbE network, it is recommended to disable the interrupt moderation in order to get the full 1GbE throughput. To do so, run:
ethtool -C <interface> adaptive-rx off rx-usecs 0 rx-frames 0

3.6 Tuning for NUMA Architecture
3.6.1 Tuning for Intel Sandy Bridge Platform
The Intel Sandy Bridge processor has an integrated PCI express controller, thus every PCIe adapter is connected directly to a NUMA node. On a system with more than one NUMA node, performance will be better when using the local NUMA node to which the PCIe adapter is connected.
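For instance, once the adapter's node is known (the sysfs query is described in the next section), the application can be bound to that node. A hedged sketch, assuming numactl is installed and eth2 is a placeholder interface name:

node=$(cat /sys/class/net/eth2/device/numa_node)      # the adapter's local node
numactl --cpunodebind=$node --membind=$node ib_write_bw -a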
Command line example:
- Receiver:
ntttcp_x64.exe -r -t 15 -m 16,*,<receiver IP>
- Sender:
ntttcp_x64.exe -s -t 15 -m 16,*,<same address as receiver>
More details and the tool binaries can be found here:
http://gallery.technet.microsoft.com/NTttcp-Version-528-Now-f8b12769
2.3.3 Intel Nehalem/Westmere Processors
2.3.4 AMD Processors
3 Performance Tuning for Linux
3.1 Tuning the Network Adapter for Improved IPv4 Traffic Performance
3.2 Tuning the Network Adapter for Improved IPv6 Traffic Performance
3.3 Preserving Your Performance Settings after a Reboot
3.4 Tuning Power Management
3.4.1 Checking Core Frequency
3.4.2 Setting the Scaling Governor
3.4.3 Kernel Idle Loop Tuning
3.4.4 OS Controlled Power Management
3.5 Interrupt Moderation
3.6 Tuning for NUMA Architecture
3.6.1 Tuning for Intel Sandy Bridge Platform
3.6.2 Tuning for AMD Architecture
3.6.3 Recognizing NUMA Node Cores
3.7 IRQ Affinity
3.7.1 IRQ Affinity Configuration
3.7.2 Auto Tuning Utility
3.7.3 Tuning for Multiple Adapters
3.8 Tuning Multi-Threaded IP Forwarding
3.9 Tuning VMA Parameters
Processor: CPU frequency select: Max performance
Memory: Memory speed: Max performance
Memory: Memory channel mode: Independent
Memory: Node Interleaving: Disabled / NUMA
Memory: Channel Interleaving: Enabled
Memory: Thermal Mode: Performance
Note: Hyper-Threading can increase the message rate for multi-process applications by providing more logical cores. It might increase the latency of a single process, due to the lower frequency of a single logical core when Hyper-Threading is enabled.

2.3.3 Intel Nehalem/Westmere Processors
The following table displays the recommended BIOS settings in machines with Intel Nehalem-based processors.
BIOS Option: Recommended Value
General: Operating Mode / Power profile: Maximum Performance
Processor: C-States: Disabled
Processor: Turbo mode: Disabled
Processor: Hyper-Threading: Disabled (recommended for latency and message-rate sensitive applications)
Processor: CPU frequency select: Max performance
Memory: Memory speed: Max performance
Memory: Memory channel mode: Independent
Memory: Node Interleaving: Disabled / NUMA
Memory: Channel Interleaving: Enabled
Memory: Thermal Mode: Performance

2.3.4 AMD Processors
The following table displays the recommended BIOS settings in machines with AMD-based processors.
BIOS Option: Recommended Value
General: Operating Mode / Power profile: Maximum Performance
Processor: C-States: Disabled
Processor: Turbo mode: Disabled
Processor: HPC Optimizations: Enabled
In order to run an application on a certain NUMA node, the process affinity should be set either in the command line or with an external tool.
> To run an application, run the following command:
start /affinity 0xff00 nd_write_bw -S/-C <ip>

5.3 Tuning for Windows Server 2012
5.3.1 Recognizing NUMA Node Cores
> To recognize NUMA node cores, perform the following:
1. Open the Task Manager.
2. Go to the Performance tab.
3. Choose CPU.
4. Right-click on the graph and choose Change graph to > Logical processors.
Hovering over a CPU will display its NUMA node.

5.4 Finding the Closest NUMA Node to the NIC
Note: BIOS support for ACPI SLIT must be enabled.
> To find the closest NUMA node to the NIC, perform the following:
1. Open a PowerShell window.
2. Execute Get-NetAdapterRss -Name <Connection Name>, where <Connection Name> is the name assigned to the desired interface (e.g. "Ethernet 1").
Expected output (abbreviated):
PS> Get-NetAdapterRss -Name "Ethernet 4"
Name                 : Ethernet 4
InterfaceDescription : Mellanox ConnectX-3 Ethernet Adapter #2
RssProcessorArray    : <entries of the form Group:Number/NUMA Distance>
IndirectionTable     : <entries of the form Group:Number>
The RssProcessorArray field displays the closest NUMA node. The array should have entries of the form G:C/D, where:
G = the processor group
C = the processor core ID
D = the distance between the NUMA node closest to the physical PCI slot where the NIC is installed and the NUMA node where processor core C resides.
In order to identify which NUMA node is the adapter's node, the system BIOS should support ACPI SLIT.
> To see if your system supports PCIe adapter NUMA node detection:
cat /sys/class/net/<interface>/device/numa_node
or:
cat /sys/devices/<PCIe device path>/numa_node
Example for a supported system:
cat /sys/class/net/eth3/device/numa_node
0
Example for an unsupported system:
cat /sys/class/net/ib0/device/numa_node
-1

3.6.1.1 Improving Application Performance on Remote NUMA Node
Verbs API applications that mostly use polling will see an impact when using the remote NUMA node. libmlx4 has a built-in enhancement that recognizes an application that is pinned to a remote NUMA node and activates a flow that improves the out-of-the-box latency and throughput. However, the NUMA node recognition must be enabled as described in section 3.6.1.
In systems which do not support SLIT, the following environment variable should be applied:
MLX4_LOCAL_CPUS=0x<bit mask of the local NUMA node's cores>
Example for a local NUMA node whose cores are 0-7:
MLX4_LOCAL_CPUS=0xff
An additional modification can impact this feature by changing the following environment variable:
MLX4_STALL_NUM_LOOP=<integer> (default: 400)
Note: The default value is optimized for most applications. However, several applications might benefit from increasing/decreasing this value.
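The MLX4_LOCAL_CPUS mask above can also be derived from sysfs instead of being typed by hand. A minimal sketch, assuming eth2 is a placeholder interface name and that the node's cpulist uses the usual list/range syntax (e.g. "0-7" or "1,3,5"):

node=$(cat /sys/class/net/eth2/device/numa_node)
mask=0
for tok in $(sed 's/,/ /g' /sys/devices/system/node/node$node/cpulist); do
    first=${tok%-*}; last=${tok#*-}        # a single core yields first == last
    for c in $(seq $first $last); do
        mask=$((mask | (1 << c)))          # set the bit for each local core
    done
done
export MLX4_LOCAL_CPUS=$(printf '0x%x' $mask)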
Clicking the Run Tuning button will change several registry entries (described below), and will check for system services that might decrease network performance. It will also generate a log including the applied changes.
Users can view this log to restore the previous values. The log path is:
%HOMEDRIVE%\Windows\System32\LogFiles\PerformanceTunning.log
This tuning is needed on one adapter only, and only once after the installation, as long as these entries are not changed directly in the registry or by some other installation or script.

5.2 Tuning for NUMA Architecture
5.2.1 Tuning for Intel Microarchitecture Code name Sandy Bridge
The Intel Sandy Bridge processor has an integrated PCI express controller, thus every PCIe adapter is connected directly to a NUMA node. On a system with more than one NUMA node, performance will be better when using the local NUMA node to which the PCIe adapter is connected.

5.2.2 Tuning for AMD Architecture
On AMD architecture there is a difference between a 2-socket system and a 4-socket system:
- With a 2-socket system, the PCIe adapter will be connected to socket 0 (nodes 0, 1).
- With a 4-socket system, the PCIe adapter will be connected either to socket 0 (nodes 0, 1) or to socket 3 (nodes 6, 7).

5.2.3 Running an Application on a Certain NUMA Node
Please note that modifying the registry incorrectly might lead to serious problems, including the loss of data or a system hang, and you may need to reinstall Windows. As such, it is recommended to back up the registry on your system before implementing the recommendations included in this document. If the modifications you apply lead to serious problems, you will be able to restore the original registry state. For more details about backing up and restoring the registry, please visit www.microsoft.com.

5.1 Tuning the Network Adapter
> To improve the network adapter performance, activate the performance tuning tool as follows:
1. Select Start > Control Panel.
2. Open Network Connections.
3. Right-click on one of the entries (Mellanox ConnectX Ethernet Adapter) and select Properties.
4. Select the Performance tab.
5. Choose one of the Tuning Scenarios:
- Single port traffic: improves performance when running single port traffic each time.
- Dual port traffic: improves performance when running traffic on both ports simultaneously.
- Forwarding traffic: improves performance when running routing scenarios (for example, via IXIA). Available in Mellanox WinOF v4.2 and above.
- Multicast traffic: improves performance when the main traffic runs on multicast. Available in Mellanox WinOF v4.2 and above.
- Single stream traffic: optimizes tuning for applications with a single connection.
6. Click the Run Tuning button.
To prevent the Linux IRQ balancer application from interfering with the interrupt affinity scheme, the IRQ balancer must be turned off.
The following command turns off the IRQ balancer:
> /etc/init.d/irqbalance stop
The following command assigns the affinity of a single interrupt vector:
> echo <hexadecimal bit mask> > /proc/irq/<irq vector>/smp_affinity
Bit i in <hexadecimal bit mask> indicates whether processor core i is in <irq vector>'s affinity or not.

3.7.1 IRQ Affinity Configuration
Note: It is recommended to set each IRQ to a different core.
For Sandy Bridge or AMD systems, set the IRQ affinity to the adapter's NUMA node:
- For optimizing single-port traffic, run:
set_irq_affinity_bynode.sh <node> <interface>
- For optimizing dual-port traffic, run:
set_irq_affinity_bynode.sh <node> <interface1> <interface2>
- To show the current IRQ affinity settings, run:
show_irq_affinity.sh <interface>

3.7.2 Auto Tuning Utility
MLNX_OFED 2.0.x introduces a new affinity tool called mlnx_affinity. This tool can automatically adjust your affinity settings for each network interface according to the system architecture.
Usage:
- Start: mlnx_affinity start
- Stop: mlnx_affinity stop
- Restart: mlnx_affinity restart
mlnx_affinity can also be started by driver load/unload.
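The affinity scripts above ultimately write masks into /proc/irq/<n>/smp_affinity, as shown at the start of this section. A hedged sketch of the same mechanism, spreading one interface's vectors across a core list and reading the result back (eth2 and the core list are placeholders; mlx4 IRQ names are assumed to contain the interface name, and the smp_affinity_list file requires a reasonably recent kernel):

cores=(0 1 2 3); i=0
for irq in $(awk '/eth2/ {sub(":","",$1); print $1}' /proc/interrupts); do
    core=${cores[$((i % ${#cores[@]}))]}
    echo $core > /proc/irq/$irq/smp_affinity_list   # pin each vector to one core
    echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
    i=$((i+1))
done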
Processor: CPU frequency select: Max performance
Memory: Memory speed: Max performance
Memory: Memory channel mode: Independent
Memory: Node Interleaving: Disabled / NUMA
Memory: Channel Interleaving: Enabled
Memory: Thermal Mode: Performance
Note: Hyper-Threading can increase the message rate for multi-process applications by providing more logical cores. It might increase the latency of a single process, due to the lower frequency of a single logical core when Hyper-Threading is enabled.

3 Performance Tuning for Linux
You can use the Linux sysctl command to modify default system network parameters that are set by the operating system, in order to improve IPv4 and IPv6 traffic performance. Note, however, that changing the network parameters may yield different results on different systems; the results are significantly dependent on the CPU and chipset efficiency.

3.1 Tuning the Network Adapter for Improved IPv4 Traffic Performance
The following changes are recommended for improving IPv4 traffic performance:
- Disable the TCP timestamps option for better CPU utilization:
sysctl -w net.ipv4.tcp_timestamps=0
- Enable the TCP selective acks option for better throughput:
sysctl -w net.ipv4.tcp_sack=1
- Increase the maximum length of processor input queues:
sysctl -w net.core.netdev_max_backlog=250000
- Increase the TCP maximum and default buffer sizes using setsockopt():
sysctl -w net.core.rmem_max=4194304
3.4 Tuning Power Management
3.4.1 Checking Core Frequency
Check that the output CPU frequency for each core is equal to the maximum supported and that all core frequencies are consistent:
- Check the maximum supported CPU frequency:
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq
- Check that the core frequencies are consistent:
cat /proc/cpuinfo | grep "cpu MHz"
- Check that the output frequencies are the same as the maximum supported. If the CPU frequency is not at the maximum, check the BIOS settings according to the tables in section Recommended BIOS Settings to verify that the power state is disabled.
- Check the current CPU frequency to see whether it is configured to the maximum available frequency:
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq

3.4.2 Setting the Scaling Governor
If the following modules are loaded, CPU scaling is supported, and you can improve performance by setting the scaling mode to performance:
- freq_table
- acpi_cpufreq (this module is architecture-dependent)
It is also recommended to disable the cpuspeed module (this module is also architecture-dependent).
> To set the scaling mode to performance, use:
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
> To disable cpuspeed, use:
service cpuspeed stop
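The echo above sets the governor for a single core (cpu7). A hedged loop form applying it to every core (requires root; the cpufreq sysfs layout is as shown above):

for gov in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
    echo performance > "$gov"     # set the "performance" governor per core
done
service cpuspeed stop             # as above; the service name may vary by distribution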
3.4.3 Kernel Idle Loop Tuning
The mlx4_en kernel module has an optional parameter that can tune the kernel idle loop for better latency. This will improve the CPU wakeup time, but may result in higher power consumption.
To tune the kernel idle loop, set the following options in the /etc/modprobe.d/mlnx.conf file. Please be aware that if the file does not exist, it must be created, having the same name as the one stated above.
- For MLNX_OFED 2.0.x:
options mlx4_core enable_sys_tune=1
- For MLNX_EN 1.5.10:
options mlx4_en enable_sys_tune=1

3.4.4 OS Controlled Power Management
Some operating systems can override the BIOS power management configuration and enable C-states by default, which results in a higher latency. To resolve the high latency issue, please follow the instructions below:
1. Edit the /boot/grub/grub.conf file or any other bootloader configuration file.
2. Add the following kernel parameters to the bootloader command:
intel_idle.max_cstate=0 processor.max_cstate=1
3. Reboot the system.
Example:
title RH6.2x64
root (hd0,0)
kernel /vmlinuz-RH6.2x64-2.6.32-220.el6.x86_64 root=UUID=817c207b-c0e8-4ed9-9c33-c589c0bb566f console=tty0 console=ttyS0,115200n8 rhgb intel_idle.max_cstate=0 processor.max_cstate=1
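After the reboot, the effect of the parameters above can be checked. A hedged example (the intel_idle sysfs parameter path is assumed to be present on kernels of this generation):

cat /sys/module/intel_idle/parameters/max_cstate   # expect 0
grep -o 'processor.max_cstate=[0-9]*' /proc/cmdline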
3.9.4 Handling Single-Threaded Processes
You can improve performance for single-threaded processes:
- Change the threading parameter to VMA_THREAD_MODE=0.
This setting helps to eliminate VMA locks and improve performance.

3.9.5 Reducing DMAs
Reduce the number of DMAs (direct memory access actions) the NIC performs by using VMA_TX_SGE=5.
If you are testing a 128-byte UDP multicast payload, adding the headers (8 bytes UDP + 20 bytes IP + 14 bytes MAC) means you need the inline value to be just above 170 bytes. VMA_TX_SGE=5 will give the best value in this case, since it is more than the total packet size while wasting very little extra buffer space.

4 Performance Tuning for Virtualized Environment
4.1 Tuning for Hypervisor
It is recommended to configure the iommu to pass-thru option in order to improve hypervisor performance.
> To configure the iommu to pass-thru option:
- Add the following to the kernel parameters:
intel_iommu=on iommu=pt
The virtualization service might enable global IPv4 forwarding, which in turn will cause all interfaces to disable their large receive offload (LRO) capability.
> To re-enable the large receive offload capability using ethtool:
ethtool -K <interface> lro on

5 Performance Tuning for Windows
This document describes how to modify Windows registry parameters in order to improve performance.
3.9.2 Reducing Memory Footprint
A smaller memory footprint reduces cache misses, thereby improving performance. Configure the following parameters to reduce the memory footprint:
- If your application uses small messages, reduce the MTU using VMA_MTU=200.
- The default RX buffer is 200 KB. Reduce your RX buffers to 30-60 KB using VMA_RX_BUFS=30000.
Note: This value must not be less than the value of VMA_RX_WRE.

3.9.3 Polling Configurations
You can improve performance by setting the following polling configurations (a combined launch example follows this list):
- Increase the number of times to unsuccessfully poll an Rx for VMA packets before going to sleep, using VMA_RX_POLL=100000. This setting is recommended when the Rx path latency is critical and CPU usage is not critical.
- Increase the duration in microseconds (usec) in which to poll the hardware on the Rx path before blocking for an interrupt, using VMA_SELECT_POLL=100000. This setting increases the number of times the selected path successfully receives poll hits, which improves the latency and causes increased CPU utilization.
- Disable the following polling parameters by setting their values to 0:
  - VMA_RX_SKIP_OS
  - VMA_RX_POLL_OS_RATIO (when disabled, only offloaded sockets are polled)
  - VMA_SELECT_POLL_OS_RATIO (when disabled, only offloaded sockets are polled)
  - VMA_SELECT_SKIP_OS
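A hedged example pulling this section's suggestions into one launch line (the values are the starting points suggested above, not mandates, and my-application is a placeholder workload):

LD_PRELOAD=libvma.so VMA_MTU=200 VMA_RX_BUFS=30000 \
    VMA_RX_POLL=100000 VMA_SELECT_POLL=100000 \
    VMA_RX_SKIP_OS=0 VMA_RX_POLL_OS_RATIO=0 \
    VMA_SELECT_POLL_OS_RATIO=0 VMA_SELECT_SKIP_OS=0 \
    ./my-application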
Revision 1.6, October 2012:
- Added the following sections: Recognizing NUMA Node Cores (Linux), Running an Application on a Certain NUMA Node (Linux), Running an Application on a Certain NUMA Node (Windows), Recognizing NUMA Node Cores (Windows Server 2012), and Recognizing NUMA Node Cores (Windows 2008 R2).
- Updated section Tuning the Network Adapter.
Revision 1.5, May 2012:
- Added the following sections: Memory Configuration, Tuning for IPoIB/EoIB Interfaces, and Kernel Idle Loop Tuning.
- Updated the following sections: IRQ Affinity Configuration, Recommended BIOS Settings, Tuning for Multiple Adapters, and Tuning for Windows 2008 R2.
Revision 1.4, April 2012:
- Added the Tuning for NUMA Architecture sections.
- Rearranged sections in chapter 3.
Revision 1.3, March 2012:
- Added new section Tuning Power Management.
Revision 1.2, January 2012:
- Updated versions of adapters to make the document more generic.
- Merged the sections on BIOS Power Management Settings and Intel Hyper-Threading Technology into the new section Recommended BIOS Settings.
- Added sections to Performance Tuning for Linux.
- Added section Tuning for Windows 2008 R2.
2 General System Configurations
The following sections describe recommended configurations for system components and/or interfaces. Different systems may have different features, thus some of the recommendations below may not be applicable.

2.1 PCI Express (PCIe) Capabilities
Table 2: Recommended PCIe Configuration
PCIe Generation: 3.0
Speed: 8 GT/s
Width: x8 or x16
Max Payload size: 256
Max Read Request: 4096
Note: For ConnectX-3 based network adapters (40GbE Ethernet adapters), it is recommended to use an x16 PCIe slot to benefit from the additional buffers allocated by the CPU.

2.2 Memory Configuration
For high performance, it is recommended to use the highest memory speed with the fewest DIMMs, and to populate all memory channels for every CPU installed. For further information, please refer to your vendor's memory configuration instructions or the memory configuration tool available online.

2.3 Recommended BIOS Settings
Note: These performance optimizations may result in higher power consumption.
2.3.1 General
Set the BIOS power management to Maximum Performance.
2.3.2 Intel Sandy Bridge Processors
The following table displays the recommended BIOS settings in machines with Intel code-name Sandy Bridge based processors.
BIOS Option: Recommended Value
General: Operating Mode / Power profile: Maximum Performance
Processor: C-States: Disabled
Processor: Turbo mode: Enabled
Processor: Hyper-Threading: HPC: disabled; Data Centers: enabled