Home
Sample Applications User Guide
Contents
19.4.1.3 Application Arguments
19.4.1.4 Mbuf Pool Initialization
19.4.1.5 Ports Configuration and Pairing
19.4.1.6 Logical Cores Assignment
19.4.1.7 Receive, Process and Transmit Packets
19.4.2 Control Application: qwctl
19.4.2.1 Command Definitions
19.4.2.2 Accessing Shared Variables
20 Timer Sample Application
20.1 Compiling the Application
20.2 Running the Application
20.3 Explanation
20.3.1 Initialization and Main Loop
20.3.2 Managing Timers
21 VMDQ and DCB Forwarding Sample Application
21.1 Overview
21.2 Compiling the Application
21.3 Running the Application
21.4 Explanation
21.4.1 Initialization
21.4.2 Statistics Display
22 Vhost Sample Application
The allocation code continues as follows:

        conf.group_id = (uint16_t)port_id;
        conf.mbuf_size = MAX_PACKET_SZ;

        /* The first KNI device associated to a port
         * is the master, for multiple kernel thread environment */
        if (i == 0) {
            struct rte_kni_ops ops;
            struct rte_eth_dev_info dev_info;

            memset(&dev_info, 0, sizeof(dev_info));
            rte_eth_dev_info_get(port_id, &dev_info);
            conf.addr = dev_info.pci_dev->addr;
            conf.id = dev_info.pci_dev->id;

            memset(&ops, 0, sizeof(ops));
            ops.port_id = port_id;
            ops.change_mtu = kni_change_mtu;
            ops.config_network_if = kni_config_network_interface;

            kni = rte_kni_alloc(pktmbuf_pool, &conf, &ops);
        } else
            kni = rte_kni_alloc(pktmbuf_pool, &conf, NULL);

        if (!kni)
            rte_exit(EXIT_FAILURE, "Fail to create kni for "
                     "port: %d\n", port_id);
        params[port_id]->kni[i] = kni;
    }

    return 0;
}

The other step in the initialization process that is unique to this sample application is the association of each port with lcores for RX, TX and kernel threads:

• One lcore to read from the port and write to the associated one or more KNI devices
• Another lcore to read from one or more KNI devices and write to the port
• Other lcores for pinning the kernel threads on one by one

This is done by using the kni_port_params_array[] array.

Intel Data Plane Development Kit (Intel DPDK) Sample Applications User Guide
June 2014, Document Number: 328218-008
qavg port X subport Y tc Z: Show average queue size per subport for a specific traffic class.
qavg port X subport Y pipe Z: Show average queue size per pipe.
qavg port X subport Y pipe Z tc A: Show average queue size per pipe for a specific traffic class.
qavg port X subport Y pipe Z tc A q B: Show average queue size of a specific queue.

Example

The following is an example command with a single packet flow configuration:

    ./qos_sched -c a2 -n 4 -- --pfc "3,2,5,7" --cfg "profile.cfg"

This example uses a single packet flow configuration, which creates one RX thread on lcore 5 reading from port 3, and a worker thread on lcore 7 writing to port 2.

Another example, with 2 packet flow configurations using different ports but sharing the same core for the QoS scheduler, is given below:

    ./qos_sched -c c6 -n 4 -- --pfc "3,2,2,6,7" --pfc "1,0,2,6,7" --cfg "profile.cfg"

Note that independent cores for the packet flow configurations, for each of the RX, WT and TX threads, are also supported, providing flexibility to balance the work.

The EAL coremask is constrained to contain the default master core 1 and the RX, WT and TX cores only.

Explanation

The Port, Subport, Pipe, Traffic Class and Queue are the hierarchical entities in a typical QoS application:

• A subport represents a predefined group of users.
• A pipe represents an individual user/subscriber.
• A traffic class is the representation of a different traffic type with specific loss rate, delay and jitter requirements.
    /* init RTE timer library to be used to initialize per-core timers */
    rte_timer_subsystem_init();

    /* per-core initialization */
    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
        if (rte_lcore_is_enabled(lcore_id) == 0)
            continue;

        /* init power management library for a specified core */
        ret = rte_power_init(lcore_id);
        if (ret)
            rte_exit(EXIT_FAILURE, "Power management library "
                     "initialization failed on core%d\n", lcore_id);

        /* init timer structures for each enabled lcore */
        rte_timer_init(&power_timers[lcore_id]);
        hz = rte_get_hpet_hz();
        rte_timer_reset(&power_timers[lcore_id],
                        hz/TIMER_NUMBER_PER_SECOND, SINGLE, lcore_id,
                        power_timer_cb, NULL);
    }

11.5.2 Monitoring Loads of Rx Queues

In general, the polling nature of the Intel DPDK prevents the OS power management subsystem from knowing if the network load is actually heavy or light. In this sample, sampling the network load is done by monitoring received and available descriptors on NIC Rx queues in recent polls. Based on the number of returned and available Rx descriptors, this example implements algorithms to generate frequency scaling hints and speculative sleep durations, and uses them to control the P-state and C-state of processors via the power management library. Frequency (P-state) control and sleep state (C-state) control work individually for each logical core, and the combination of them contributes to a power-efficient packet processing solution when serving light network loads.
This command is composed of two tokens:

• The first token is a string token that can be "show" or "del".
• The second token is an object that was previously added using the "add" command in the global obj_list variable.

Once the command is parsed, the rte_cmdline application fills a cmd_obj_del_show_result structure. A pointer to this structure is given as an argument to the callback function and can be used in the body of this function.

3 Exception Path Sample Application

The Exception Path sample application is a simple example that demonstrates the use of the Intel DPDK to set up an exception path for packets to go through the Linux kernel. This is done by using virtual TAP network interfaces. These can be read from and written to by the Intel DPDK application and appear to the kernel as a standard network interface.

3.1 Overview

The application creates two threads for each NIC port being used. One thread reads from the port and writes the data unmodified to a thread-specific TAP interface. The second thread reads from a TAP interface and writes the data unmodified to the NIC port.
    Port    Queue   Lcore   Description
    0       0       0       Map queue 0 from port 0 to lcore 0
    0       1       2       Map queue 1 from port 0 to lcore 2
    1       0       1       Map queue 0 from port 1 to lcore 1
    1       1       3       Map queue 1 from port 1 to lcore 3

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

Explanation

The following sections provide some explanation of the sample application code. As mentioned in the overview section, the initialization and run-time paths are very similar to those of the L2 Forwarding application (see Chapter 9, "L2 Forwarding Sample Application in Real and Virtualized Environments," for more information). The following sections describe aspects that are specific to the L3 Forwarding sample application.

Hash Initialization

The hash object is created and loaded with the pre-configured entries read from a global array, and then the expected 5-tuple is generated as the key to keep consistency with real flows, for the convenience of executing hash performance tests on 4M/8M/16M flows. The hash initialization sets up both the IPv4 and IPv6 hash tables, and populates either one depending on the value of the ipv6 variable. To support the hash performance test with up to 8M single-direction flows (16M bi-direction flows), the populate_ipv4_many_flow_into_table() function populates the hash table with the specified number of hash table entries (default: 4M).
            rte_exit(EXIT_FAILURE, "failed to reset slave\n");

        res->flags = 0;
        pair->flags = 0;
    }
    rte_spinlock_unlock(&res_lock);

When the slave process is spawned and starts to run, it checks whether the floating process option is applied. If so, it clears the affinity to a specific core and also sets the unique core ID to 0. Then it tries to allocate a new core ID. Since the core ID has changed, the resource allocated by the master cannot work, so it remaps the resource to the new core ID slot:

    static int
    l2fwd_launch_one_lcore(__attribute__((unused)) void *dummy)
    {
        unsigned lcore_id = rte_lcore_id();

        if (float_proc) {
            unsigned flcore_id;

            /* Change it to floating process, also change it's lcore_id */
            clear_cpu_affinity();
            RTE_PER_LCORE(_lcore_id) = 0;

            /* Get a lcore_id */
            if (flib_assign_lcore_id() < 0) {
                printf("flib_assign_lcore_id failed\n");
                return -1;
            }
            flcore_id = rte_lcore_id();

            /* Set mapping id, so master can return it after slave exited */
            mapping_id[lcore_id] = flcore_id;
            printf("Org lcore_id = %u, cur lcore_id = %u\n",
                   lcore_id, flcore_id);
            remapping_slave_resource(lcore_id, flcore_id);
        }

        l2fwd_main_loop();

        /* return lcore_id before return */
        if (float_proc) {
            flib_free_lcore_id(rte_lcore_id());
            mapping_id[lcore_id] = INVALID_MAPPING_ID;
        }
        return 0;
    }
stats app: Shows a table with in-app calculated statistics.

stats port X subport Y: For a specific subport, it shows the number of packets that went through the scheduler properly and the number of packets that were dropped. The same information is shown in bytes. The information is displayed in a table, separated into different traffic classes.

stats port X subport Y pipe Z: For a specific pipe, it shows the number of packets that went through the scheduler properly and the number of packets that were dropped. The same information is shown in bytes. This information is displayed in a table, separated into individual queues.

• Average queue size

All of these commands work the same way, averaging the number of packets throughout a specific subset of queues. Two parameters can be configured for this prior to calling any of these commands:

qavg n X: n is the number of times that the calculation will take place. Bigger numbers provide higher accuracy. The default value is 10.

qavg period X: period is the number of microseconds that will be allowed between each calculation. The default value is 100.

The commands that can be used for measuring average queue size are:

qavg port X subport Y: Show average queue size per subport.
24.1 Introduction

The Netmap compatibility library provides a minimal set of APIs to give programs written against the Netmap APIs the ability to run, with minimal changes to their source code, using the Intel DPDK to perform the actual packet I/O.

Since Netmap applications use regular system calls, like open(), ioctl() and mmap(), to communicate with the Netmap kernel module performing the packet I/O, the compat_netmap library provides a set of similar APIs to use in place of those system calls, effectively turning a Netmap application into an Intel DPDK one.

The provided library is currently minimal and doesn't support all the features that Netmap supports, but is enough to run simple applications, such as the bridge example detailed below.

Knowledge of Netmap is required to understand the rest of this section. Please refer to the Netmap distribution for details about Netmap.

24.2 Available APIs

The library provides the following drop-in replacements for system calls usually used in Netmap applications:

• rte_netmap_close()
• rte_netmap_ioctl()
• rte_netmap_open()
• rte_netmap_mmap()
• rte_netmap_poll()

They use the same signature as their libc counterparts and can be used as drop-in replacements in most cases.

24.3 Caveats

Given the difference between the way Netmap and the Intel DPDK approach packet I/O, there are caveats and limitations to be aware of.
Contents

1 Introduction
1.1 Documentation Roadmap
2 Command Line Sample Application
2.1 Overview
2.2 Compiling the Application
2.3 Running the Application
2.4 Explanation
2.4.1 EAL Initialization and cmdline Start
2.4.2 Defining a cmdline Context
3 Exception Path Sample Application
3.1 Overview
3.2 Compiling the Application
3.3 Running the Application
3.3.1 Getting Statistics
3.4 Explanation
3.4.1 Initialization
3.4.2 Packet Forwarding
3.4.3 Managing TAP Interfaces and Bridges
4 Hello World Sample Application
4.1 Compiling the Application
16.1.5 Master-slave Multi-process Example

The fourth example of Intel DPDK multi-process support demonstrates a master-slave model that provides the capability of application recovery if a slave process crashes or meets unexpected conditions. In addition, it also demonstrates the floating process, which can run among different cores, in contrast to the traditional way of binding a process/thread to a specific CPU core, using the local cache mechanism of mempool structures.

This application performs the same functionality as the L2 Forwarding sample application, therefore this chapter does not cover that part but describes the functionality that is introduced in this multi-process example only. Please refer to Chapter 9, "L2 Forwarding Sample Application in Real and Virtualized Environments," for more information.

Unlike previous examples, where all processes are started from the command line with input arguments, in this example only one process is spawned from the command line and that process creates other processes. The following section describes this in more detail.

16.1.5.1 Master-slave Process Models

The process spawned from the command line is called the master process in this document. A process created by the master is called a slave process. The application has only one master process, but could have multiple slave processes. Once the master process begins to run, it tries to initialize all the resources, such as memory, CPU cores, driver and ports, as the other examples do.
    if (timer_period > 0) {
        /* advance the timer */
        timer_tsc += diff_tsc;

        /* if timer has reached its timeout */
        if (unlikely(timer_tsc >= (uint64_t) timer_period)) {
            /* do this only on master core */
            if (lcore_id == rte_get_master_lcore()) {
                print_stats();
                /* reset the timer */
                timer_tsc = 0;
            }
        }
    }

    prev_tsc = cur_tsc;

10 L3 Forwarding Sample Application

The L3 Forwarding application is a simple example of packet processing using the Intel DPDK. The application performs L3 forwarding.

10.1 Overview

The application demonstrates the use of the hash and LPM libraries in the Intel DPDK to implement packet forwarding. The initialization and run-time paths are very similar to those of the L2 Forwarding application (see Chapter 9, "L2 Forwarding Sample Application in Real and Virtualized Environments," for more information). The main difference from the L2 Forwarding sample application is that the forwarding decision is made based on information read from the input packet. The lookup method is either hash-based or LPM-based and is selected at compile time.
Figure 7. Example Data Flow in a Symmetric Multi-process Application

As the diagram shows, each process reads packets from each of the network ports in use. RSS is used to distribute incoming packets on each port to different hardware RX queues. Each process reads a different RX queue on each port and so does not contend with any other process for that queue access. Similarly, each process writes outgoing packets to a different TX queue on each port.

16.1.3.1 Running the Application

As with the simple_mp example, the first instance of the symmetric_mp process must be run as the primary instance, though with a number of other application-specific parameters also provided after the EAL arguments. These additional parameters are:

• -p <portmask>, where portmask is a hexadecimal bitmask of what ports on the system are to be used. For example: -p 3 to use ports 0 and 1 only.
• --num-procs <N>, where N is the total number of symmetric_mp instances that will be run side-by-side to perform packet processing. This parameter is used to configure the appropriate number of receive queues on each network port.
• --proc-id <n>, where n is a numeric value in the range 0 <= n < N (number of processes, specified above). This identifies which symmetric_mp instance is being run, so that each process can read a unique receive queue on each network port.
C-states are also known as sleep states. They allow software to put an Intel core into a low power idle state from which it is possible to exit via an event, such as an interrupt. However, there is a tradeoff between the power consumed in the idle state and the time required to wake up from the idle state (exit latency). Therefore, as you go into deeper C-states, the power consumed is lower but the exit latency is increased. Each C-state has a target residency. It is essential that when entering into a C-state, the core remains in this C-state for at least as long as the target residency in order to fully realize the benefits of entering the C-state.

CPUIdle is the infrastructure provided by the Linux kernel to control the processor C-state capability. Unlike CPUFreq, CPUIdle does not provide a mechanism that allows the application to change C-state. It actually has its own heuristic algorithms in kernel space to select a target C-state to enter, by executing privileged instructions like HLT and MWAIT, based on the speculative sleep duration of the core.

In this application, we introduce a heuristic algorithm that allows packet processing cores to sleep for a short period if there is no Rx packet received on recent polls. In this way, CPUIdle automatically forces the corresponding cores to enter deeper C-states instead of always running to the C0 state waiting for packets.

Note: To fully demonstrate the power saving capability of using C-states, it is recommended to enable deeper C-states in the BIOS during system boot up.
Figure 17 Pipeline Overview
Figure 18 Ring-based Processing Pipeline Performance Setup
Figure 19 Threads and Pipelines
Figure 20 Packet Flow Through the VMDQ and DCB Sample Application
Figure 21 QEMU Virtio-net (prior to vhost-net)
Figure 22 Virtio with Linux Kernel Vhost
Figure 23 Vhost-net Architectural Overview
Figure 24 Packet Flow Through the vhost-net Sample Application
Figure 25 Packet Flow on TX in DPDK-testpmd
Figure 26 Test Pipeline Application

Tables

Table 1 Output Traffic Marking
Table 2 Entity Types
Table 3

Intel Data Plane Development Kit (Intel DPDK) Sample Applications User Guide
June 2014, Document Number: 328218-008

Revision History

June 2014, Revision 008: Supports public software release 1.7.0. Added Section 12.0, L3 Forwarding with Access Control Sample Application.
For example: 192.168.0.34/32, where the address is 192.168.0.34 and the prefix length is 32.

Ports are specified as a range of 16-bit numbers in the format MIN:MAX, where MIN and MAX are the inclusive minimum and maximum values of the range. The range 0:65535 represents all possible ports in a range. When MIN and MAX are the same value, a single port is represented, for example, 20:20.

The protocol identifier is an 8-bit value and a mask separated by '/'. For example: 6/0xfe matches protocol values 6 and 7.

• Route rules start with a leading character 'R' and have the same format as ACL rules, except an extra field at the tail that indicates the forwarding port number.

Rules File Example

Figure 5 is an example of a rules file. This file has three rules, one for ACL and two for route information.

Figure 5. Example Rules File

    @1.2.3.0/24 192.168.0.36/32 0:65535 0:65535 6/0xfe
    R0.0.0.0/0 192.168.0.36/32 0:65535 0:65535 6/0xfe 1
    R0.0.0.0/0 0.0.0.0/0 0:65535 0:65535 0x0/0x0 0

Each rule is explained as follows:

• Rule 1 (the first line) tells the application to drop those packets with source IP address 1.2.3.*, destination IP address 192.168.0.36, protocol 6/7.
• Rule 2 (the second line) is similar to Rule 1, except the source IP address is ignored. It tells the application to forward packets with destination IP address 192.168.0.36.
    for (portid = 0; portid < nb_ports; portid++) {
        /* skip ports that are not enabled */
        if ((lsi_enabled_port_mask & (1 << portid)) == 0)
            continue;

        /* save the destination port id */
        if (nb_ports_in_mask % 2) {
            lsi_dst_ports[portid] = portid_last;
            lsi_dst_ports[portid_last] = portid;
        }
        else
            portid_last = portid;

        nb_ports_in_mask++;
        rte_eth_dev_info_get((uint8_t) portid, &dev_info);
    }

Observe that:

• rte_eal_pci_probe() parses the devices on the PCI bus and initializes recognized devices.

The next step is to configure the RX and TX queues. For each port, there is only one RX queue (only one lcore is able to poll a given port). The number of TX queues depends on the number of available lcores. The rte_eth_dev_configure() function is used to configure the number of queues for a port:

    ret = rte_eth_dev_configure((uint8_t) portid, 1, 1, &port_conf);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
                 ret, portid);

The global configuration is stored in a static structure:

    static const struct rte_eth_conf port_conf = {
        .rxmode = {
            .split_hdr_size = 0,
            .header_split = 0,   /**< Header Split disabled */
            .hw_ip_checksum = 0, /**< IP checksum offload disabled */
            .hw_vlan_filter = 0, /**< VLAN filtering disabled */
            .hw_strip_crc = 0,   /**< CRC stripped by hardware */
        },
        .txmode = {
        },
        .intr_conf = {
            .lsc = 1, /**< link status interrupt enabled */
        },
    };
8.6.1 Initialization

Setup of the mbuf pool, driver and queues is similar to the setup done in the L2 Forwarding sample application (see Chapter 9, "L2 Forwarding Sample Application in Real and Virtualized Environments," for details). In addition, one or more kernel NIC interfaces are allocated for each of the configured ports according to the command line parameters.

The code for creating the kernel NIC interface for a specific port is as follows:

    kni = rte_kni_create(port, MAX_PACKET_SZ, pktmbuf_pool, &kni_ops);
    if (kni == NULL)
        rte_exit(EXIT_FAILURE, "Fail to create kni dev for port: %d\n",
                 port);

The code for allocating the kernel NIC interfaces for a specific port is as follows:

    static int
    kni_alloc(uint8_t port_id)
    {
        uint8_t i;
        struct rte_kni *kni;
        struct rte_kni_conf conf;
        struct kni_port_params **params = kni_port_params_array;

        if (port_id >= RTE_MAX_ETHPORTS || !params[port_id])
            return -1;

        params[port_id]->nb_kni = params[port_id]->nb_lcore_k ?
                    params[port_id]->nb_lcore_k : 1;

        for (i = 0; i < params[port_id]->nb_kni; i++) {
            /* Clear conf at first */
            memset(&conf, 0, sizeof(conf));
            if (params[port_id]->nb_lcore_k) {
                rte_snprintf(conf.name, RTE_KNI_NAMESIZE,
                             "vEth%u_%u", port_id, i);
                conf.core_id = params[port_id]->lcore_k[i];
                conf.force_bind = 1;
            } else
                rte_snprintf(conf.name, RTE_KNI_NAMESIZE,
                             "vEth%u", port_id);
Default value: 4096.

• --flowttl=TTL[(s|ms)]: determines the maximum Time To Live for a fragmented packet. If all fragments of the packet do not appear within the given time-out, then they are considered invalid and will be dropped. Valid range is 1ms - 3600s. Default value: 1s.

To run the example in a linuxapp environment with 2 lcores (2, 4) over 2 ports (0, 2) with 1 RX queue per lcore:

    ./build/ip_reassembly -c 0x14 -n 3 -- -p 5
    EAL: coremask set to 14
    EAL: Detected lcore 0 on socket 0
    EAL: Detected lcore 1 on socket 1
    EAL: Detected lcore 2 on socket 0
    EAL: Detected lcore 3 on socket 1
    EAL: Detected lcore 4 on socket 0
    ...
    Initializing port 0 on lcore 2... Address:00:1B:21:76:FA:2C, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 1
    Initializing port 2 on lcore 4... Address:00:1B:21:5C:FF:54, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 3
    IP_FRAG: Socket 0: adding route 100.10.0.0/16 (port 0)
    IP_RSMBL: Socket 0: adding route 100.20.0.0/16 (port 1)
    IP_RSMBL: Socket 0: adding route 0101:0101:0101:0101:0101:0101:0101:0101/48 (port 0)
    IP_RSMBL: Socket 0: adding route 0201:0101:0101:0101:0101:0101:0101:0101/48 (port 1)
    IP_RSMBL: entering main loop on lcore 4
    IP_RSMBL: -- lcoreid=4 portid=2
    IP_RSMBL: entering main loop on lcore 2
    IP_RSMBL: -- lcoreid=2 portid=0

To run the example in a linuxapp environment with 1 lcore (4) over 2 ports (0, 2):
At run time, core A is creating the following lookup key and storing it into the packet meta data for core B to use for table lookup: [destination IPv4 address, 4 bytes of 0].

hash-spec-16-lru: LRU hash table with 16-byte key size and 16 million entries.
hash-spec-16-ext: Extendible bucket hash table with 16-byte key size and 16 million entries.

16 million entries are successfully added to the hash table with the following key format: [4-byte index, 12 bytes of 0]. The action configured for all table entries is "Send to output port", with the output port index uniformly distributed for the range of output ports. The default table rule (used in the case of a lookup miss) is to drop the packet.

At run time, core A is creating the following lookup key and storing it into the packet meta data for core B to use for table lookup: [destination IPv4 address, 12 bytes of 0].

hash-spec-32-lru: LRU hash table with 32-byte key size and 16 million entries.
hash-spec-32-ext: Extendible bucket hash table with 32-byte key size and 16 million entries.

16 million entries are successfully added to the hash table with the following key format: [4-byte index, 28 bytes of 0]. The action configured for all table entries is "Send to output port", with the output port index uniformly distributed for the range of output ports. The default table rule (used in the case of a lookup miss) is to drop the packet.
13.2 Compiling the Application

To compile the application:

1. Go to the sample application directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/l3fwd-vf

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

Note: The compiled application is written to the build subdirectory. To have the application written to a different location, the O=/path/to/build/directory option may be specified in the make command.

13.3 Running the Application

The application has a number of command line options:

    ./build/l3fwd-vf [EAL options] -- -p PORTMASK --config(port,queue,lcore)[,(port,queue,lcore)] [--no-numa]

where,

• -p PORTMASK: Hexadecimal bitmask of ports to configure
• --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues from which ports are mapped to which cores
• --no-numa: optional, disables numa awareness

For example, consider a dual processor socket platform where cores 0, 2, 4, 6, 8 and 10 appear on socket 0, while cores 1, 3, 5, 7, 9 and 11 appear on socket 1. Let's say that the programmer wants to use memory from both NUMA nodes.
For the cryptographic path cases (D = 1 or D = 2), byte C specifies the cipher algorithm and byte B the cryptographic hash algorithm to be used for the current packet. Byte A is not used and can be any value. The cipher and cryptographic hash algorithms supported by this application are listed in the crypto.h header file.

For each input packet, the destination NIC TX port is decided at the forwarding stage (executed after the cryptographic stage, if enabled for the packet) by looking at the RX port index of the dst_ports array, which was initialized at startup, the output port being the adjacent enabled port. For example, if ports 1, 3, 5 and 6 are enabled, for input port 1 the output port will be 3 (and vice versa), and for input port 5 the output port will be 6 (and vice versa).

For the cryptographic path, it is the payload of the IPv4 packet that is encrypted or decrypted.

19.1.1 Setup

Building and running this application requires having both the Intel DPDK package and the QuickAssist Technology Software Library installed, as well as at least one Intel QuickAssist Technology hardware device present in the system.

For more details on how to build and run Intel DPDK and Intel QuickAssist Technology applications, please refer to the following documents:

• Intel DPDK Getting Started Guide
• Intel Communications Chipset 8900 to 8920 Series Software for Linux Getting Started Guide (440005)
• --max-pkt-len: optional, maximum packet length in decimal (64-9600)
• --no-numa: optional, disables numa awareness

See Chapter 10, "L3 Forwarding Sample Application," for details. The L3fwd-power example reuses the L3fwd command line options.

11.5 Explanation

The following sections provide some explanation of the sample application code. As mentioned in the overview section, the initialization and run-time paths are identical to those of the L3 Forwarding application. The following sections describe aspects that are specific to the L3 Forwarding with Power Management sample application.

11.5.1 Power Library Initialization

The Power library is initialized in the MAIN routine. It changes the P-state governor to userspace for specific cores that are under control. The Timer library is also initialized, and several timers are created later on, responsible for checking if it needs to scale down frequency at run time by checking CPU utilization statistics.

Note: Only the power management related initialization is shown.

    int MAIN(int argc, char **argv)
    {
        struct lcore_conf *qconf;
        int ret;
        unsigned nb_ports;
        uint16_t queueid;
        unsigned lcore_id;
        uint64_t hz;
        uint32_t n_tx_queue, nb_lcores;
        uint8_t portid, nb_rx_queue, queue, socketid;
    hdr->pkt.nb_segs = (uint8_t)(pkt->pkt.nb_segs + 1);

    /* copy metadata from source packet */
    hdr->pkt.in_port = pkt->pkt.in_port;
    hdr->pkt.vlan_macip = pkt->pkt.vlan_macip;
    hdr->pkt.hash = pkt->pkt.hash;
    hdr->ol_flags = pkt->ol_flags;

    __rte_mbuf_sanity_check(hdr, RTE_MBUF_PKT, 1);

    return hdr;
}

7 IP Reassembly Sample Application

The IP Reassembly application is a simple example of packet processing using the Intel DPDK. The application performs L3 forwarding with reassembly for fragmented IPv4 and IPv6 packets.

7.1 Overview

The application demonstrates the use of the Intel DPDK libraries to implement packet forwarding with reassembly for IPv4 and IPv6 fragmented packets. The initialization and run-time paths are very similar to those of the L2 Forwarding application (see Chapter 9, "L2 Forwarding Sample Application in Real and Virtualized Environments," for more information). The main difference from the L2 Forwarding sample application is that it reassembles fragmented IPv4 and IPv6 packets before forwarding. The maximum allowed size of a reassembled packet is 9.5 KB.

There are two key differences from the L2 Forwarding sample application.
tables, bring NIC ports up or down, and so on.

Compiling the Application

1. Go to the examples directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/ip_pipeline

2. Set the target (a default target is used if not specified):

    export RTE_TARGET=x86_64-native-linuxapp-gcc

3. Build the application:

    make

Running the Sample Code

The application execution command line is:

    ip_pipeline [EAL options] -- -p PORTMASK -f CONFIG_FILE

The number of ports in the PORTMASK can be either 2 or 4.

The config file assigns functionality to the CPU cores by deciding the pipeline type to run on each CPU core (e.g. master, RX, flow classification, firewall, routing, IP fragmentation, IP reassembly, TX) and also allows creating complex topologies made up of CPU cores by interconnecting the CPU cores through SW queues.

Once the application is initialized, the CLI is available for populating the application tables, bringing NIC ports up or down, and so on.

The flow classification pipeline implements the flow table by using a large (multi-million entry) hash table with a 16-byte key size. The lookup key is the IPv4 5-tuple, which is extracted from the input packet by the packet RX pipeline and saved in the packet meta-data, and has the following
1 2 1 5 1 3 0 0 0 14 1 0 0 3 2 6 0 4 0 0 0 15 1 0 1 4 3 7 1

14.4.2 NUMA Support

The application has built-in performance enhancements for the NUMA case:

1. One buffer pool per CPU socket.
2. One LPM table per CPU socket.
3. Memory for the NIC RX or TX rings is allocated on the same socket as the lcore handling the respective ring.

In the case where multiple CPU sockets are used in the system, it is recommended to enable at least one lcore to fulfil the I/O role for the NIC ports that are directly attached to that CPU socket through the PCI Express bus. It is always recommended to handle the packet I/O with lcores from the same CPU socket as the NICs.

Depending on whether the I/O RX lcore (same CPU socket as NIC RX), the worker lcore and the I/O TX lcore (same CPU socket as NIC TX) handling a specific input packet are on the same or different CPU sockets, the following run-time scenarios are possible:

1. AAA: The packet is received, processed and transmitted without going across CPU sockets.
2. AAB: The packet is received and processed on socket A, but as it has to be transmitted on a NIC port connected to socket B, the packet is sent to socket B through software rings.
3. ABB: The packet is received on socket A, but as it
    cd ${RTE_SDK}/examples/l2fwd

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

9.3 Running the Application

The application requires a number of command line options:

    ./build/l2fwd [EAL options] -- -p PORTMASK [-q NQ]

where:

- -p PORTMASK: A hexadecimal bitmask of the ports to configure
- -q NQ: A number of queues (=ports) per lcore (default is 1)

To run the application in a linuxapp environment with 4 lcores, 16 ports and 8 RX queues per lcore, issue the command:

    ./build/l2fwd -c f -n 4 -- -q 8 -p ffff

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

9.4 Explanation

The following sections provide some explanation of the code.

9.4.1 Command Line Arguments

The L2 Forwarding sample application takes specific parameters in addition to Environment Abstraction Layer (EAL) arguments (see Section 9.3). The preferred way to parse parameters is to use the getopt() function, since it is part of a well-defined and portable library. The parsing of arguments
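A minimal, self-contained sketch of this getopt()-based parsing follows. It is simplified from the sample; the helper names parse_portmask and parse_args are illustrative, not the sample's actual function names:

```c
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a hexadecimal port bitmask such as "ffff"; returns 0 on bad input. */
static unsigned long parse_portmask(const char *arg)
{
    char *end = NULL;
    unsigned long pm = strtoul(arg, &end, 16);
    return (end == NULL || *end != '\0') ? 0 : pm;
}

/* Walk argv with getopt(), filling the portmask and queues-per-lcore. */
static int parse_args(int argc, char **argv,
                      unsigned long *portmask, unsigned *nq)
{
    int opt;
    optind = 1;                       /* reset getopt state */
    while ((opt = getopt(argc, argv, "p:q:")) != -1) {
        switch (opt) {
        case 'p':
            *portmask = parse_portmask(optarg);
            break;
        case 'q':
            *nq = (unsigned)strtoul(optarg, NULL, 10);
            break;
        default:
            return -1;                /* unknown option */
        }
    }
    return 0;
}
```

In the real sample, the EAL consumes its own arguments first and the application parses only what remains after the "--" separator.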
,(0,1,2),(1,0,1),(1,1,3)" --rule_ipv4="rule_ipv4.db" --rule_ipv6="rule_ipv6.db" --scalar

In this command:

- The -c option enables cores 0, 1, 2, 3.
- The -p option enables ports 0 and 1.
- The --config option enables two queues on each port and maps each (port, queue) pair to a specific core.

Logic to enable multiple RX queues using RSS and to allocate memory from the correct NUMA nodes is included in the application and is done transparently. The following table shows the mapping in this example:

Port  Queue  Lcore  Description
0     0      0      Map queue 0 from port 0 to lcore 0
0     1      2      Map queue 1 from port 0 to lcore 2
1     0      1      Map queue 0 from port 1 to lcore 1
1     1      3      Map queue 1 from port 1 to lcore 3

- The --rule_ipv4 option specifies the reading of IPv4 rule sets from the rule_ipv4.db file.
- The --rule_ipv6 option specifies the reading of IPv6 rule sets from the rule_ipv6.db file.
- The --scalar option specifies the performing of rule lookup with a scalar function.

Explanation

The following sections provide some explanation of the sample application code. The aspects of port, device and CPU configuration are similar to those of the L3 forwarding application (see Chapter 10, L3 Forwarding Sample Application for more information). The following sections describe aspects that are specific to L3 forwarding with access control.

Parse Rules from File

As described earlier, both ACL and route rules are assumed
2014            007   Supports public software release 1.7.0 EA1.
                      - Added Section 24.0, Internet Protocol (IP) Pipeline Sample Application.
                      - Added Section 25.0, Test Pipeline Application.
January 2014          Supports public software release 1.6.0.
                      - Added Section 22.0, Vhost Sample Application.
                      - Added Section 23.0, Netmap Compatibility Sample Application.

1 Introduction

This document describes the sample applications that are included in the Intel Data Plane Development Kit (Intel DPDK). Each chapter describes a sample application that showcases specific functionality, and provides instructions on how to compile, run and use the sample application.

1.1 Documentation Roadmap

The following is a list of Intel DPDK documents in suggested reading order:

- Release Notes: Provides release-specific information, including supported features, limitations, fixed issues, known issues and so on. Also provides the answers to frequently asked questions in FAQ format.
- Getting Started Guides: Describes how to install and configure the Intel DPDK software for your operating system; designed to get users up and running quickly with the software.
- Programmer's Guide: Describes the software architecture and how to use it (through examples), specifically in a Linux application (linuxapp) environment.
3 and 4, the following commands could be used:

    mp_server: ./mp_server/build/mp_server -c 6 -n 4 -- -p 3 -n 2
    mp_client: ./mp_client/build/mp_client -c 8 -n 4 --proc-type=auto -- -n 0
    mp_client: ./mp_client/build/mp_client -c 10 -n 4 --proc-type=auto -- -n 1

Note: If the server application dies and needs to be restarted, all client applications also need to be restarted, as there is no support in the server application for it to run as a secondary process. Any client processes that need restarting can be restarted without affecting the server process.

16.1.4.2 How the Application Works

The server process performs the network port and data structure initialization much as the symmetric multi-process application does when run as primary. One additional enhancement in this sample application is that the server process stores its port configuration data in a memory zone in hugepage shared memory. This eliminates the need for the client processes to have the portmask parameter passed into them on the command line, as is done for the symmetric multi-process application, and therefore eliminates mismatched parameters as a potential source of errors.

In the same way that the server process is designed to be run as a primary process instance only, the client processes are designed to be run as secondary instances only. They have no code to attempt to create shared memory objects. Instead, handles to all needed rings and memory pools are obtained via calls to rte_ring_lookup() and
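This lookup-by-name pattern (a secondary process resolving objects the primary created, never creating them itself) can be modeled with a small self-contained sketch. The toy registry and the names below are illustrative stand-ins, not the DPDK object tables or APIs:

```c
#include <stddef.h>
#include <string.h>

/* Toy name -> pointer registry modeling how a secondary process finds
 * objects the primary created: lookup only resolves, it never creates. */
struct named_obj {
    const char *name;
    void *obj;
};

static void *registry_lookup(const struct named_obj *tbl, size_t n,
                             const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return tbl[i].obj;
    return NULL;   /* like rte_ring_lookup(): NULL when the name is absent */
}
```

The real rte_ring_lookup() and rte_mempool_lookup() resolve names against tables kept in the shared hugepage memory, so both processes see the same objects.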
     13.4.1  Command Line Arguments ............................... 82
     13.4.2  Mbuf Pool Initialization ............................. 82
     13.4.3  Driver Initialization ................................ 82
     13.4.4  Interrupt Callback Registration ...................... 84
     13.4.5  RX Queue Initialization .............................. 85
     13.4.6  TX Queue Initialization .............................. 85
     13.4.7  Receive, Process and Transmit Packets ................ 86
14   Load Balancer Sample Application ............................. 89
     14.1  Overview ............................................... 89
           14.1.1  I/O RX Logical Cores ........................... 90
           14.1.2  I/O TX Logical Cores ........................... 90
           14.1.3  Worker Logical Cores ........................... 90
     14.2  Compiling the Application .............................. 90
     14.3  Running the Application ................................ 91
     14.4  Explanation ............................................ 91
           14.4.1  Application Configuration ...................... 91
           14.4.2  NUMA Support ................................... 92
15   Multi-process Sample Application ............................. 94
     15.1  Example Applications ................................... 94
           15.1.1  Building the Sample Applications ............... 94
           15.1.2
B, C and D random; source IP address fixed to 0; source TCP port fixed to 0; destination TCP port fixed to 0.

- run cmd_file_path: Read CLI commands from an external file and run them one by one.

The full list of the available CLI commands can be displayed by pressing the TAB key while the application is running.

26 Test Pipeline Application

The Test Pipeline application illustrates the use of the Intel DPDK Packet Framework tool suite. Its purpose is to demonstrate the performance of single-table Intel DPDK pipelines.

26.1 Overview

The application uses three CPU cores (Figure 22):

- Core A (RX core): receives traffic from the NIC ports and feeds core B with traffic through SW queues.
- Core B (Pipeline core): implements a single-table Intel DPDK pipeline whose type is selectable through a specific command line parameter. Core B receives traffic from core A through software queues, processes it according to the actions configured in the table entries that are hit by the input packets, and feeds it to core C through another set of software queues.
- Core C (TX core): receives traffic from core B through software queues and sends it to the NIC ports for transmission.

26.2 Compiling the Application

1. Go to the app/test directory:
TX CORE": Packet flow configuration. Multiple pfc entities can be configured in the command line, having 4 or 5 items (if TX core defined or not).

Optional application parameters include:

- -i: It makes the application start in the interactive mode. In this mode, the application shows a command line that can be used for obtaining statistics while scheduling is taking place (see interactive mode below for more information).
- --mst n: Master core index (the default value is 1).
- --rsz "A, B, C": Ring sizes:
  - A: Size (in number of buffer descriptors) of each of the NIC RX rings read by the I/O RX lcores (the default value is 128).
  - B: Size (in number of elements) of each of the software rings used by the I/O RX lcores to send packets to worker lcores (the default value is 8192).
  - C: Size (in number of buffer descriptors) of each of the NIC TX rings written by worker lcores (the default value is 256).
- --bsz "A, B, C, D": Burst sizes:
  - A: I/O RX lcore read burst size from the NIC RX (the default value is 64).
  - B: I/O RX lcore write burst size to the output software rings, worker lcore read burst size from input software rings, QoS enqueue size (the default value is 64).
  - C: QoS dequeue size (the default value is 32).
  - D: Worker lcore
    /dev/ptmx, /dev/kvm, /dev/kqemu,
    /dev/rtc, /dev/hpet, /dev/net/tun,
    /dev/<devbase-name>-<index>

- Disable SELinux or set it to permissive mode.
- Mount the cgroup device controller:

    user@target# mkdir /dev/cgroup
    user@target# mount -t cgroup none /dev/cgroup -o devices

- Restart the libvirtd system process. For example, on Fedora:

    systemctl restart libvirtd.service

- Edit the configuration parameters section of the script:
  - Configure the emul_path variable to point to the QEMU emulator:

        emul_path = "/usr/local/bin/qemu-system-x86_64"

  - Configure the us_vhost_path variable to point to the Intel DPDK vhost-net sample code's character device. The Intel DPDK vhost-net sample code's character device will be in the format /dev/<basename>-<index>:

        us_vhost_path = "/dev/usvhost-1"

23.7.5 Common Issues

QEMU failing to allocate memory on hugetlbfs:

    file_ram_alloc: can't mmap RAM pages: Cannot allocate memory

When running QEMU, the above error implies that it has failed to allocate memory for the Virtual Machine on the hugetlbfs. This is typically due to insufficient hugepages being free to support the allocation request. The number of free hugepages can be checked as follows:

    user@target# cat /sys/kernel/mm/hugepages/hugepages
    RTE_LCORE_FOREACH(i) {
        if (input_cores_mask & (1ULL << i)) {
            /* Skip ports that are not enabled */
            while ((ports_mask & (1 << rx_port)) == 0) {
                rx_port++;
                if (rx_port >= (sizeof(ports_mask) * 8))
                    goto fail; /* not enough ports */
            }
            port_ids[i] = rx_port++;
        }
        else if (output_cores_mask & (1ULL << i)) {
            /* Skip ports that are not enabled */
            while ((ports_mask & (1 << tx_port)) == 0) {
                tx_port++;
                if (tx_port >= (sizeof(ports_mask) * 8))
                    goto fail; /* not enough ports */
            }
            port_ids[i] = tx_port++;
        }
    }

Packet Forwarding

After the initialization steps are complete, the main_loop() function is run on each lcore. This function first checks the lcore_id against the user provided input_cores_mask and output_cores_mask to see if this core is reading from or writing to a TAP interface.

For the case that reads from a NIC port, the packet reception is the same as in the L2 Forwarding sample application (see Section 9.4.6, Receive, Process and Transmit Packets on page 54). The packet transmission is done by calling write() with the file descriptor of the appropriate TAP interface and then explicitly freeing the mbuf back to the pool.

    /* Loop forever reading from NIC and writing to tap */
    for (;;) {
        struct rte_mbuf *pkts_burst[PKT_BURST_SZ];
        unsigned i;
        const unsigned nb_rx = rte_eth_rx_burst(port_ids[lcore_id], 0,
                                                pkts_burst, PKT_BURST_SZ);
        lcore_stats[lcore_id].rx += nb_rx;
        for (i = 0; likely(i < nb_rx); i++) {
- GREEN: The packet's color is changed to green.
- YELLOW: The packet's color is changed to yellow.
- RED: The packet's color is changed to red.
- DROP: The packet is dropped.

In this particular case:

- Every packet whose input and output colors are the same keeps the same color.
- Every packet whose color has improved is dropped (this particular case cannot happen, so these values will not be used).
- For the rest of the cases, the color is changed to red.

18 QoS Scheduler Sample Application

The QoS sample application demonstrates the use of the Intel DPDK to provide QoS scheduling.

18.1 Overview

The architecture of the QoS scheduler application is shown in Figure 11, QoS Scheduler Application Architecture: per packet flow configuration, an RX CPU core feeds a traffic management CPU core, which feeds a TX CPU core.

There are two flavors of the runtime execution for this application, with two or three threads per each packet flow configuration being used. The RX thread reads packets from the RX port, classifies the packets based on the double VLAN (outer and inner) and the lower two bytes of the IP destination address, and puts them into the ring queue. The worker thread dequeues the packets from the ring and calls the QoS scheduler enqueue/dequeue
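The metering policy described above (same color in and out: keep it; color improved: drop, which cannot happen with a conforming meter; all other cases: mark red) maps naturally onto a small action table. This is an illustrative self-contained sketch, not the sample's actual code:

```c
/* Packet colors as used by srTCM/trTCM metering (RFC 2697 / RFC 2698). */
enum color { GREEN = 0, YELLOW = 1, RED = 2 };
#define DROP 3   /* policy action: drop the packet */

/* policy[input_color][output_color] -> action (a new color, or DROP).
 * Diagonal: keep the color. Below-diagonal (color "improved"): drop.
 * Everything else: mark red. */
static const int policy[3][3] = {
    /* out:  GREEN   YELLOW  RED */
    {        GREEN,  RED,    RED },   /* in: GREEN  */
    {        DROP,   YELLOW, RED },   /* in: YELLOW */
    {        DROP,   DROP,   RED },   /* in: RED    */
};
```

A static table like this makes the policing step a single array lookup per packet.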
15.3 Running the Application

To successfully run the application, the command line used to start the application has to be in sync with the traffic flows configured on the traffic generator side.

For examples of application command lines and traffic generator flows, please refer to the Intel DPDK Test Report. For more details on how to set up and run the sample applications provided with the Intel DPDK package, please refer to the Intel DPDK Getting Started Guide.

15.4 Explanation

15.4.1 Application Configuration

The application run-time configuration is done through the application command line parameters. Any parameter that is not specified as mandatory is optional, with the default value hard-coded in the main.h header file from the application folder.

The list of application command line parameters is listed below:

1. --rx "(PORT, QUEUE, LCORE), ...": The list of NIC RX ports and queues handled by the I/O RX lcores. This parameter also implicitly defines the list of I/O RX lcores. This is a mandatory parameter.
2. --tx "(PORT, LCORE), ...": The list of NIC TX ports handled by the I/O TX lcores. This parameter also implicitly defines the list of I/O TX lcores. This is a mandatory parameter.
3. --w "LCORE, ...": The list of the worker lcores. This is a mandatory parameter.
4. --lpm "IP / PREFIX => PORT; ...": The list of LPM rules used by the worker
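An LPM rule maps an IP prefix to an output port by longest-prefix match. A minimal self-contained sketch of that lookup (a linear scan for illustration only; the real application uses the DPDK LPM library, which is far more efficient):

```c
#include <stdint.h>

/* One LPM rule: "IP / PREFIX => PORT". */
struct lpm_rule {
    uint32_t ip;     /* network address, host byte order */
    uint8_t depth;   /* prefix length, 0..32 */
    uint8_t port;    /* output port */
};

/* Return the port of the longest matching prefix, or -1 if none match. */
static int lpm_lookup(const struct lpm_rule *rules, int n, uint32_t ip)
{
    int best = -1, best_depth = -1;

    for (int i = 0; i < n; i++) {
        /* Build the netmask for this rule's depth (depth 0 matches all). */
        uint32_t mask = rules[i].depth == 0 ?
                        0 : 0xffffffffu << (32 - rules[i].depth);
        if ((ip & mask) == (rules[i].ip & mask) &&
            rules[i].depth > best_depth) {
            best_depth = rules[i].depth;
            best = rules[i].port;
        }
    }
    return best;
}
```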
Performs the following tasks:

  - Parse command parameters. Check the validity of rule file(s) name(s), number of logical cores, receive and transmit queues. Bind ports, queues and logical cores. Check ACL search options, and so on.
  - Call Environmental Abstraction Layer (EAL) and Poll Mode Driver (PMD) functions to initialize the environment and detect possible NICs. The EAL creates several threads and sets affinity to a specific hardware thread (CPU) based on the configuration specified by the command line arguments.
  - Read the rule files and format the rules into the representation that the ACL library can recognize. Call the ACL library function to add the rules into the database and compile them as a trie of pattern sets. Note that the application maintains separate AC contexts for IPv4 and IPv6 rules.

- Runtime Phase: Process the incoming packets from a port. Packets are processed in three steps:
  - Retrieval: Gets a packet from the receive queue. Each logical core may process several queues for different ports. This depends on the configuration specified by command line arguments.
  - Lookup: Checks that the packet type is supported (IPv4/IPv6) and performs a 5-tuple lookup over the corresponding AC context. If an ACL rule is matched, the packet will be dropped and returned back to step 1. If a route rule is matched, it indicates the packet is not in the ACL list and should be forwarded. If there is no match for the packet, then the packet
17 QoS Metering Sample Application

The QoS meter sample application is an example that demonstrates the use of the Intel DPDK to provide QoS marking and metering, as defined by RFC 2697 for Single Rate Three Color Marker (srTCM) and RFC 2698 for Two Rate Three Color Marker (trTCM) algorithms.

17.1 Overview

The application uses a single thread for reading the packets from the RX port, metering, marking them with the appropriate color (green, yellow or red) and writing them to the TX port.

A policing scheme can be applied before writing the packets to the TX port by dropping or changing the color of the packet in a static manner, depending on both the input and output colors of the packets that are processed by the meter.

The operation mode can be selected at compile time out of the following options:

- Simple forwarding
- srTCM color blind
- srTCM color aware
- trTCM color blind
- trTCM color aware

Please refer to RFC 2697 and RFC 2698 for details about the srTCM and trTCM configurable parameters (CIR, CBS and EBS for srTCM; CIR, PIR, CBS and PBS for trTCM).

The color blind modes are functionally equivalent to the color aware modes when all the incoming packets are colored as green.

17.2 Compiling the Application

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/qos_met
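The srTCM algorithm of RFC 2697 can be sketched in a few lines: two token buckets, committed (sized CBS) and excess (sized EBS), fill at rate CIR; a packet is green if it fits in the committed bucket, yellow if it fits in the excess bucket, red otherwise. This color-blind sketch (with a simplified one-second clock) is illustrative and is not the DPDK rte_meter implementation:

```c
#include <stdint.h>

enum tcm_color { TCM_GREEN, TCM_YELLOW, TCM_RED };

/* Color-blind srTCM state: committed (tc) and excess (te) token buckets. */
struct srtcm {
    uint64_t cir;       /* committed information rate, bytes/sec */
    uint64_t cbs, ebs;  /* bucket sizes, bytes */
    uint64_t tc, te;    /* current tokens */
    uint64_t last;      /* last update time, whole seconds (simplified) */
};

/* Refill the buckets for the elapsed time, then color one packet. */
static enum tcm_color srtcm_color(struct srtcm *m, uint64_t now, uint32_t len)
{
    uint64_t tokens = (now - m->last) * m->cir;
    m->last = now;

    /* RFC 2697: fill tc up to CBS first; overflow goes to te up to EBS. */
    m->tc += tokens;
    if (m->tc > m->cbs) {
        m->te += m->tc - m->cbs;
        m->tc = m->cbs;
        if (m->te > m->ebs)
            m->te = m->ebs;
    }

    if (m->tc >= len) { m->tc -= len; return TCM_GREEN;  }
    if (m->te >= len) { m->te -= len; return TCM_YELLOW; }
    return TCM_RED;
}
```

The color-aware variants differ only in that a pre-colored packet may not claim tokens from a "better" bucket than its input color allows.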
This process is automated in the QEMU wrapper script, discussed in Section 23.7.3.

23.7.3 QEMU Wrapper Script

The QEMU wrapper script automatically detects and calls QEMU with the necessary parameters required to integrate with the vhost sample code. It performs the following actions:

- Automatically detects the location of the hugetlbfs and inserts this into the command line parameters.
- Automatically opens file descriptors for each virtio-net device and inserts this into the command line parameters.
- Disables offloads on each virtio-net device.
- Calls QEMU, passing both the command line parameters passed to the script itself and those it has auto-detected.

The QEMU wrapper script will automatically configure calls to QEMU:

    user@target# qemu-wrap.py -machine pc-i440fx-1.4,accel=kvm,usb=off
        -cpu SandyBridge -smp 4,sockets=4,cores=1,threads=1
        -netdev tap,id=hostnet1,vhost=on
        -device virtio-net-pci,netdev=hostnet1,id=net1
        -hda <disk img> -m 4096

which will become the following call to QEMU:

    /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-1.4,accel=kvm,usb=off
        -cpu SandyBridge -smp 4,sockets=4,cores=1,threads=1
        -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd>
        -device virtio-net-pci,netdev=hostnet1,id=net1,csum=off,gso=off,
        guest_tso4=o
To fully understand this code, it is recommended to study the chapters that relate to the Poll Mode Driver in the Intel DPDK Getting Started Guide and the Intel DPDK API Reference.

20.4.1.2 Shared Variables Setup

The quota and low_watermark shared variables are put into an rte_memzone using a call to setup_shared_variables():

    void setup_shared_variables(void)
    {
        const struct rte_memzone *qw_memzone;

        qw_memzone = rte_memzone_reserve(QUOTA_WATERMARK_MEMZONE_NAME,
                                         2 * sizeof(int), rte_socket_id(),
                                         RTE_MEMZONE_2MB);
        if (qw_memzone == NULL)
            rte_exit(EXIT_FAILURE, "%s\n", rte_strerror(rte_errno));

        quota = qw_memzone->addr;
        low_watermark = (unsigned int *) qw_memzone->addr + sizeof(int);
    }

These two variables are initialized to a default value in MAIN and can be changed while qw is running, using the qwctl control program.

20.4.1.3 Application Arguments

The qw application only takes one argument: a port mask that specifies which ports should be used by the application. At least two ports are needed to run the application, and there should be an even number of ports given in the port mask.

The port mask parsing is done in parse_qw_args(), defined in args.c.

20.4.1.4 Mbuf Pool Initialization

Once the applic
             Wrapper Script ....................................... 148
     22.7.4  Libvirt Integration .................................. 149
     22.7.5  Common Issues ........................................ 150
     22.8  Running Intel DPDK in the Virtual Machine .............. 150
           22.8.1  Testpmd MAC Forwarding ......................... 151
           22.8.2  Running Testpmd ................................ 151
     22.9  Passing Traffic to the Virtual Machine Device .......... 152
23   Netmap Compatibility Sample Application ...................... 153
     23.1  Introduction ........................................... 153
     23.2  Available APIs ......................................... 153
     23.3  Caveats ................................................ 153
     23.4  Porting Netmap Applications ............................ 154
     23.5  Compiling the bridge Sample Application ................ 155
     23.6  Running the bridge Sample Application .................. 155
24   Internet Protocol (IP) Pipeline Sample Application ........... 156
     24.1  Overview ............................................... 156
     24.2  Compiling the Application .............................. 156
     24.3  Running the Sample Code ................................ 156
25   Test Pipeline Application ....................................
called struct lcore_queue_conf:

    struct lcore_queue_conf {
        unsigned n_rx_port;
        unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
        struct mbuf_table tx_mbufs[L2FWD_MAX_PORTS];
    } __rte_cache_aligned;
    struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];

The values n_rx_port and rx_port_list[] are used in the main packet processing loop (see Section 9.4.6, Receive, Process and Transmit Packets on page 54, later in this chapter).

The global configuration for the RX queues is stored in a static structure:

    static const struct rte_eth_rxconf rx_conf = {
        .rx_thresh = {
            .pthresh = RX_PTHRESH,
            .hthresh = RX_HTHRESH,
            .wthresh = RX_WTHRESH,
        },
    };

9.4.5 TX Queue Initialization

Each lcore should be able to transmit on any port. For every port, a single TX queue is initialized.

    /* init one TX queue on each port */
    fflush(stdout);
    ret = rte_eth_tx_queue_setup((uint8_t) portid, 0, nb_txd,
                                 rte_eth_dev_socket_id(portid), &tx_conf);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
                 ret, (unsigned) portid);

The global configuration for TX queues is stored in a static structure:

    static const struct rte_eth_txconf tx_conf = {
        .tx_thresh = {
            .pthresh = TX_PTHRESH,
classification for the IPv4/IPv6 5-tuple syntax specifically. The 5-tuple syntax consists of a source IP address, a destination IP address, a source port, a destination port and a protocol identifier. The fields in the 5-tuple syntax have the following formats:

- Source IP address and destination IP address: Each is either a 32-bit field (for IPv4), or a set of 4 32-bit fields (for IPv6), represented by a value and a mask length. For example, an IPv4 range of 192.168.1.0 to 192.168.1.255 could be represented by a value = 192.168.1.0 and a mask length = 24.
- Source port and destination port: Each is a 16-bit field, represented by a lower start and a higher end. For example, a range of ports 0 to 8192 could be represented by lower = 0 and higher = 8192.
- Protocol identifier: An 8-bit field, represented by a value and a mask, that covers a range of values. To verify that a value is in the range, use the following expression: (VAL & mask) == value

The trick in how to represent a range with a mask and value is as follows. A range can be enumerated in binary numbers with some bits that are never changed and some bits that are dynamically changed. Set those bits that dynamically changed in mask and value with 0. Set those bits that never changed in the mask with 1
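The (VAL & mask) == value test above is easy to check in isolation. A small self-contained sketch (the helper name is illustrative, not from the sample):

```c
#include <stdint.h>

/* True if val falls in the range encoded by (value, mask): bits set in
 * mask must match value exactly; masked-out bits may take any value. */
static int masked_match(uint8_t val, uint8_t value, uint8_t mask)
{
    return (val & mask) == value;
}
```

For example, an exact protocol (TCP = 6) is mask 0xff, value 6; the range 0x10 to 0x17 leaves the low 3 bits free, so it is mask 0xf8, value 0x10.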
code changes, can be run on top of the Intel DPDK. Please refer to ${RTE_SDK}/examples/netmap_compat/bridge/bridge.c for an example of a ported application.

24.5 Compiling the bridge Sample Application

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/netmap_compat

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

24.6 Running the bridge Sample Application

The application requires a single command line option:

    ./build/bridge [EAL options] -- -p PORT_A [-p PORT_B]

where:

- -p INTERFACE is the number of a valid Intel DPDK port to use.

If a single -p parameter is given, the interface will send back all the traffic it receives. If two -p parameters are given, the two interfaces form a bridge, where traffic received on one interface is replicated and sent by the other interface.

To run the application in a linuxapp environment using ports 0 and 2, issue the following command:

    ./build/bridge [EAL options] -- -p 0 -p 2

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.
core application, and thus be able to access its rings.

To run the application in a linuxapp environment, on logical core 0, issue the following command:

    ./build/qwctl -c 1 -n 4 --proc-type=secondary

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

qwctl is an interactive command line that lets the user change variables in a running instance of qw. The help command gives a list of available commands:

    qwctl> help

20.4 Code Overview

The following sections provide a quick guide to the application's source code.

20.4.1 Core Application (qw)

20.4.1.1 EAL and Drivers Setup

The EAL arguments are parsed at the beginning of the MAIN function:

    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Cannot initialize EAL\n");
    argc -= ret;
    argv += ret;

Then, a call to init_dpdk(), defined in init.c, is made to initialize the poll mode drivers:

    void init_dpdk(void)
    {
        int ret;

        /* Bind the drivers to usable devices */
        ret = rte_eal_pci_probe();
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "rte_eal_pci_probe(): error %d\n", ret);

        if (rte_eth_dev_count() < 2)
            rte_exit(EXIT_FAILURE, "Not enough ethernet port available\n");
            (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
                ("full-duplex") : ("half-duplex"));
    else
        printf("Port %d Link Down\n\n", port_id);

This function is called when a link status interrupt is present for the right port. The port_id indicates which port the interrupt applies to. The type parameter identifies the interrupt event type, which currently can be RTE_ETH_EVENT_INTR_LSC only, but other types can be added in the future. The param parameter is the address of the parameter for the callback.

This function should be implemented with care, since it will be called in the interrupt host thread, which is different from the main thread of its caller.

The application registers the lsi_event_callback and a NULL parameter to the link status interrupt event on each port:

    rte_eth_dev_callback_register((uint8_t) portid,
        RTE_ETH_EVENT_INTR_LSC, lsi_event_callback, NULL);

This registration can be done only after calling the rte_eth_dev_configure() function, and before calling any other function. If lsc is initialized with 0, the callback is never called, since no interrupt event would ever be present.

14.4.5 RX Queue Initialization

The application uses one lcore to poll one or several ports, depending on the
for both IPv4 and IPv6 packet forwarding, for any number of burst packets received, and the packet forwarding decision (that is, the identification of the output interface for the packet) for hash-based lookups is done by the get_ipv4_dst_port() or get_ipv6_dst_port() function. The get_ipv4_dst_port() function is shown below:

    static inline uint8_t
    get_ipv4_dst_port(void *ipv4_hdr, uint8_t portid,
                      lookup_struct_t *ipv4_l3fwd_lookup_struct)
    {
        int ret = 0;
        union ipv4_5tuple_host key;

        ipv4_hdr = (uint8_t *)ipv4_hdr +
                   offsetof(struct ipv4_hdr, time_to_live);
        __m128i data = _mm_loadu_si128((__m128i *)(ipv4_hdr));

        /* Get 5 tuple: dst port, src port, dst IP address,
         * src IP address and protocol */
        key.xmm = _mm_and_si128(data, mask0);

        /* Find destination port */
        ret = rte_hash_lookup(ipv4_l3fwd_lookup_struct, (const void *)&key);
        return (uint8_t)((ret < 0) ? portid : ipv4_l3fwd_out_if[ret]);
    }

The get_ipv6_dst_port() function is similar to the get_ipv4_dst_port() function.

The simple_ipv4_fwd_4pkts() and simple_ipv6_fwd_4pkts() functions are optimized for continuous 4 valid IPv4 and IPv6 packets; they leverage the multiple buffer optimization to boost the performance of forwarding packets with the exact match on hash table. The key code snippet of simple_ipv4_fwd_4pkts() is shown below:

    static inline void
    simple_ipv4_fwd_4pkts(struct rte_mbuf *m[4], uint8_t portid,
                          struct lcore_conf *qconf)
    {
format: source IP address, destination IP address, L4 protocol, L4 protocol source port, L4 protocol destination port.

The firewall pipeline implements the rule database using an ACL table. The routing pipeline implements an IP routing table by using an LPM IPv4 table, and an ARP table by using a hash table with an 8-byte key size. The IP routing table lookup provides the output interface ID and the next-hop IP address, which are stored in the packet metadata and then used as the lookup key into the ARP table. The ARP table lookup provides the destination MAC address to be used for the output packet. The action for the default entry of both the IP routing table and the ARP table is packet drop.

The following CLI operations are available:
• Enable/disable NIC ports (RX pipeline)
• Add/delete/list flows (flow classification pipeline)
• Add/delete/list firewall rules (firewall pipeline)
• Add/delete/list routes (routing pipeline)
• Add/delete/list ARP entries (routing pipeline)

In addition, there are two special commands:
• flow add all: Populate the flow classification table with 16 million flows by iterating through the last three bytes of the destination IP address. These flows are not displayed when using the flow print command. When this command is used, the following traffic profile must be used to have flow table lookup hits for all input packets: TCP/IPv4 packets with destination IP address A.B.C.D, with A fixed to 0 and
forwarding sample application (see Chapter 10, L3 Forwarding Sample Application, for more information). The main difference from the L3 Forwarding sample application is that this application introduces power-aware optimization algorithms by leveraging the Power library to control the P-state and C-state of the processor based on packet load.

The Intel DPDK includes poll mode drivers (PMDs) to configure Intel NIC devices and their receive (Rx) and transmit (Tx) queues. The design principle of these PMDs is to access the Rx and Tx descriptors directly, without any interrupts, to quickly receive, process and deliver packets in the user space. In general, the Intel DPDK executes an endless packet processing loop on dedicated IA cores that includes the following steps:
• Retrieve input packets through the PMD to poll the Rx queue
• Process each received packet, or provide received packets to other processing cores through software queues
• Send pending output packets to the Tx queue through the PMD

In this way, the PMD achieves better performance than a traditional interrupt mode driver, at the cost of keeping cores active and running at the highest frequency, and hence consuming the maximum power, all the time. However, during periods of light network traffic, which happen regularly in communication infrastructure systems due to the well-known tidal effect, the PMD is still busy-waiting for network packets, which wastes a lot of power.
The global variable ipv6 can be specified with --ipv6 on the command line. The value of the global variable hash_entry_number, which is used to specify the total hash entry number for all used ports in the hash performance test, can be specified with --hash-entry-num VALUE on the command line; its default value is 4.

    #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
    static void
    setup_hash(int socketid)
    {
        ...
        if (hash_entry_number != HASH_ENTRY_NUMBER_DEFAULT) {
            if (ipv6 == 0) {
                /* populate the ipv4 hash */
                populate_ipv4_many_flow_into_table(
                    ipv4_l3fwd_lookup_struct[socketid],
                    hash_entry_number);
            } else {
                /* populate the ipv6 hash */
                populate_ipv6_many_flow_into_table(
                    ipv6_l3fwd_lookup_struct[socketid],
                    hash_entry_number);
            }
        } else {
            if (ipv6 == 0) {
                /* populate the ipv4 hash */
                populate_ipv4_few_flow_into_table(
                    ipv4_l3fwd_lookup_struct[socketid]);
            } else {
                /* populate the ipv6 hash */
                populate_ipv6_few_flow_into_table(
                    ipv6_l3fwd_lookup_struct[socketid]);
            }
        }
    }
    #endif

L3 Forwarding Sample Application

10.4.2  LPM Initialization

The LPM object is created and loaded with the pre-configured entries read from a global array:

    #if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
    static void
    setup_lpm(int socketid)
    {
        unsigned i;
        int ret;
        char s[64];

        /* create the LPM table */
has to be processed by a worker lcore on socket B, the packet is sent to socket B through software rings. The packet is transmitted by a NIC port connected to the same CPU socket as the worker lcore that processed it.

A, B, C: The packet is received on socket A, it is processed by an lcore on socket B, then it has to be transmitted out by a NIC connected to socket C. The performance price for crossing the CPU socket boundary is paid twice for this packet.

Multi-process Sample Application

16  Multi-process Sample Application

This chapter describes the example applications for multi-processing that are included in the Intel DPDK.

16.1  Example Applications

16.1.1  Building the Sample Applications

The multi-process example applications are built in the same way as other sample applications, and as documented in the Intel DPDK Getting Started Guide. To build all the example applications:

1. Set RTE_SDK and go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd $RTE_SDK/examples/multi_process

2. Set the target (a default target will be used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the applications:

    make

Note: If just a specific multi-process application n
initialize all the resources such as memory, CPU cores, driver, ports and so on, as the other examples do. Thereafter, it creates slave processes, as shown in the following figure.

Figure 9. Master-slave Process Workflow

The master process calls the rte_eal_mp_remote_launch() EAL function to launch an application function for each pinned thread through the pipe. Then, it waits to check if any slave processes have exited. If so, the process tries to re-initialize the resources that belong to that slave and launch them in the pinned thread entry again. The following section describes the recovery procedures in more detail.

For each pinned thread in EAL, after reading any data from the pipe, it tries to call the function that the application specified. In this master-specified function, a fork() call creates a slave process that performs the L2 forwarding task. Then, the function waits until the slave exits, is killed or crashes. Thereafter, it notifies the master of this event and returns. Finally, the EAL pinned thread waits until the new function is launched.

After discussing the master-slave model, it is necessary to mention another issue: global and sta
    if (nb_ports == 0)
        rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");

    if (nb_ports > RTE_MAX_ETHPORTS)
        nb_ports = RTE_MAX_ETHPORTS;

    /* reset l2fwd_dst_ports */
    for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
        l2fwd_dst_ports[portid] = 0;
    last_port = 0;

    for (portid = 0; portid < nb_ports; portid++) {
        /* skip ports that are not enabled */
        if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
            continue;

        /* Each logical core is assigned a dedicated TX queue on each port */
        if (nb_ports_in_mask % 2) {
            l2fwd_dst_ports[portid] = last_port;
            l2fwd_dst_ports[last_port] = portid;
        } else
            last_port = portid;

        nb_ports_in_mask++;

        rte_eth_dev_info_get((uint8_t) portid, &dev_info);
    }

Observe that:

L2 Forwarding Sample Application in Real and Virtualized Environments

• rte_igb_pmd_init_all() simultaneously registers the driver as a PCI driver and as an Ethernet Poll Mode Driver.
• rte_eal_pci_probe() parses the devices on the PCI bus and initializes recognized devices.

9.4.4

The next step is to configure the RX and TX queues. For each port, there is only one RX queue (only one lcore is able to poll a given port). The number of TX queues depends on the number of available lcores. The rte_eth_dev_confi
of when trying to use the compat_netmap library, the most important of which are listed below. Additional caveats are presented in the $RTE_SDK/examples/netmap_compat/README.md file. These can change as the library is updated.

• Any system call that can potentially affect file descriptors cannot be used with a descriptor returned by the rte_netmap_open() function.

Netmap Compatibility Sample Application

24.4  Note that:

• rte_netmap_mmap() merely returns the address of an Intel DPDK memzone. The address, length, flags, offset, and so on arguments are therefore ignored completely.
• rte_netmap_poll() only supports infinite (negative) or zero time-outs. It effectively turns calls to the poll() system call made in a Netmap application into polling of the Intel DPDK ports, changing the semantics of the usual POSIX-defined poll.
• Not all of Netmap's features are supported: host rings, slot flags and so on are not supported, or are simply not relevant in the Intel DPDK model.
• The Netmap manual page states that a device obtained through /dev/netmap also supports the ioctl supported by network devices. It is not the case with this compatibility layer.
• The Netmap kernel module exposes a sysfs interface to change some internal parameters, such as the size of the shared memory region. This in
on information read from the input packet. The lookup method is either hash-based or LPM-based and is selected at compile time.

When the selected lookup method is hash-based, a hash object is used to emulate the flow classification stage. The hash object is used in correlation with the flow table to map each input packet to its flow at runtime. The hash lookup key is represented by the DiffServ 5-tuple composed of the following fields read from the input packet: Source IP Address, Destination IP Address, Protocol, Source Port and Destination Port. The ID of the output interface for the input packet is read from the identified flow table entry. The set of flows used by the application is statically configured and loaded into the hash at initialization time.

When the selected lookup method is LPM-based, an LPM object is used to emulate the forwarding stage for IPv4 packets. The LPM object is used as the routing table to identify the next hop for each input packet at runtime. The LPM lookup key is represented by the Destination IP Address field read from the input packet. The ID of the output interface for the input packet is the next hop returned by the LPM lookup. The set of LPM rules used by the application is statically configured and loaded into the LPM object at initialization time.

Note: Please refer to Section 9.1.1, Virtual Function Setup Instructions, on page 50 for virtualized test case setup.
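As a DPDK-free illustration of the LPM (longest prefix match) idea described above, the following hypothetical sketch scans a small rule set and keeps the match with the longest prefix; the real rte_lpm library uses a trie-like table structure rather than a linear scan, so this is only a functional model.

```c
#include <assert.h>
#include <stdint.h>

/* One routing rule: a network prefix of a given depth and its next hop.
 * Illustrative only; not the rte_lpm API. */
struct lpm_rule {
    uint32_t net;      /* network address, host byte order */
    uint8_t  depth;    /* prefix length in bits, 0..32 */
    uint8_t  next_hop; /* output interface ID */
};

/* Return 0 and set *next_hop on a match, -1 when no rule covers ip. */
static int lpm_lookup(const struct lpm_rule *rules, int n,
                      uint32_t ip, uint8_t *next_hop)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        /* Build the netmask for this rule's prefix length. */
        uint32_t mask = rules[i].depth ? ~0u << (32 - rules[i].depth) : 0;
        if ((ip & mask) == rules[i].net &&
            (best < 0 || rules[i].depth > rules[best].depth))
            best = i;   /* keep the longest matching prefix */
    }
    if (best < 0)
        return -1;
    *next_hop = rules[best].next_hop;
    return 0;
}
```

A /16 rule wins over an overlapping /8 rule for addresses inside the /16, which is exactly the behavior the routing stage relies on.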
outgoing packet, and although both are based on the data zero-copy idea, there are some differences in the detail.

The first approach creates a clone of the input packet, that is, it walks through all segments of the input packet and, for each segment, creates a new buffer and attaches that new buffer to the segment (refer to rte_pktmbuf_clone() in the rte_mbuf library for more details). A new buffer is then allocated for the packet header and is prepended to the cloned buffer.

IPv4 Multicast Sample Application

The second approach does not make a clone; it just increments the reference counter for all input packet segments, allocates a new buffer for the packet header and prepends it to the input packet. Basically, the first approach reuses only the input packet's data, but creates its own copy of the packet's metadata. The second approach reuses both the input packet's data and metadata.

The advantage of the first approach is that each outgoing packet has its own copy of the metadata, so we can safely modify the data pointer of the input packet. That allows us to skip creation of the output packet for the last destination port and instead modify the input packet's header in place. For example, for N destination ports, we need to invoke mcast_out_pkt() (N-1) times.

The advantage of the second approach is that there is less
performing the application processing (worker lcores). The worker lcores are totally oblivious to the intricacies of the packet I/O activity and use the NIC-agnostic interface provided by software rings to exchange packets with the I/O cores.

Overview

The architecture of the Load Balancer application is presented in the following figure. For the sake of simplicity, the diagram illustrates a specific case of two I/O RX and two I/O TX lcores off-loading the packet I/O overhead incurred by four NIC ports from four worker lcores, with each I/O lcore handling RX/TX for two NIC ports.

Figure 6. Load Balancer Application Architecture

Load Balancer Sample Application

15.1.1  I/O RX Logical Cores

Each I/O RX lcore performs packet RX from its assigned NIC RX rings and then distributes the received packets to the worker threads. The application allows each I/O RX lcore to communicate with any of the worker threads; therefore, each (I/O RX lcore, worker lcore) pair is connected through a dedicated single-producer single-consumer software ring. The worker lcore to handle the current packet is determined by reading a predefined 1-byte field from the input packet:

    worker_id = packet[load_balancing_field] % n_workers

Since all the packets that are part of the same traffic flow are expected to ha
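The worker-selection rule above can be sketched in a few lines of plain C. This is a hypothetical, DPDK-free illustration: the byte at the configurable offset (the --pos-lb parameter) is reduced modulo the number of workers, so every packet of a given flow lands on the same worker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pick the worker for a packet by reading one byte at offset pos_lb
 * (illustrative helper; names are not from the DPDK sample code). */
static unsigned pick_worker(const uint8_t *pkt, size_t pos_lb,
                            unsigned n_workers)
{
    return pkt[pos_lb] % n_workers;
}
```

With --pos-lb 29 and IPv4 traffic without a VLAN label, offset 29 is byte D of the source IP address A.B.C.D, so flows are spread by that byte.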
The rings connecting the I/O lcores and worker lcores are built by the application as a result of the application configuration provided by the user through the application command line parameters.

A specific lcore performing the I/O RX role for a specific set of NIC ports can also perform the I/O TX role for the same or a different set of NIC ports. A specific lcore cannot perform both the I/O role (either RX or TX) and the worker role during the same session.

Example:

    ./load_balancer -c 0xf8 -n 4 -- --rx "(0,0,3),(1,0,3)" --tx "(0,3),(1,3)" --w "4,5,6,7" --lpm "1.0.0.0/24=>0; 1.0.1.0/24=>1" --pos-lb 29

There is a single I/O lcore (lcore 3) that handles RX and TX for two NIC ports (ports 0 and 1) and that handles packets to/from four worker lcores (lcores 4, 5, 6 and 7) that are assigned worker IDs 0 to 3 (the worker ID for lcore 4 is 0, for lcore 5 is 1, for lcore 6 is 2 and for lcore 7 is 3).

Assuming that all the input packets are IPv4 packets with no VLAN label and the source IP address of the current packet is A.B.C.D, the worker lcore for the current packet is determined by byte D (which is byte 29). There are two LPM rules that are used by each worker lcore to route packets to the output NIC ports.

The following table illustrates the packet flow through the system for several possible traffic flows:

    Flow #  Source IP Address  Destination IP Address  Worker ID (Worker lcore)  Output NIC Port
    1       0.0.0.0            1.0.0.1                 0 (lcore 4)               0
    2       0.0.0.1            1.0
4  Hello World Sample Application

The Hello World sample application is an example of the simplest Intel DPDK application that can be written. The application simply prints a "helloworld" message on every enabled lcore.

4.1  Compiling the Application

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd $RTE_SDK/examples/helloworld

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

4.2  Running the Application

To run the example in a linuxapp environment:

    ./build/helloworld -c f -n 4

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

4.3  Explanation

The following sections provide some explanation of the code.

4.3.1  EAL Initialization

The first task is to initialize the Environment Abstraction Layer (EAL). This is done in the main() function using the following code:

    int
    MAIN(int argc, char **argv)
    {
        ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_panic("Cannot init EAL\n");

Hello World Sample Application

This call finishes the initialization process that was started befo
the IPv4 destination address for any given packet.

6.2  Building the Application

To compile the application:

1. Go to the sample application directory:

    export RTE_SDK=/path/to/rte_sdk
    cd $RTE_SDK/examples/ipv4_multicast

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

Note: The compiled application is written to the build subdirectory. To have the application written to a different location, the O=/path/to/build/directory option may be specified in the make command.

6.3  Running the Application

The application has a number of command line options:

    ./build/ipv4_multicast [EAL options] -- -p PORTMASK [-q NQ]

where:
• -p PORTMASK: Hexadecimal bitmask of ports to configure
• -q NQ: determines the number of queues per lcore

Note: Unlike the basic L2/L3 Forwarding sample applications, NUMA support is not provided in the IPv4 Multicast sample application.

Typically, to run the IPv4 Multicast sample application, issue the following command (as root):

    ./build/ipv4_multicast -c 0x00f -n 3 -- -p 0x3 -q 1

In this command:
• The -c option enables cores 0, 1, 2 and 3
• The -n opt
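The -p PORTMASK convention used by the sample applications can be sketched as follows: each set bit in the hexadecimal mask enables the port with that bit's index. This is a hypothetical, self-contained helper, not code from the sample application.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a hex portmask string (e.g. "0x3") and count the enabled ports,
 * also reporting the lowest enabled port index (-1 when none). */
static int count_enabled_ports(const char *portmask_str, int max_ports,
                               int *first_enabled)
{
    unsigned long mask = strtoul(portmask_str, NULL, 16);
    int n = 0;

    *first_enabled = -1;
    for (int p = 0; p < max_ports; p++) {
        if (mask & (1ul << p)) {          /* bit p set => port p enabled */
            if (*first_enabled < 0)
                *first_enabled = p;
            n++;
        }
    }
    return n;
}
```

For example, -p 0x3 enables ports 0 and 1, which matches the command line shown above.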
the command is parsed: an opaque pointer, a help string and a list of tokens in a NULL-terminated table.

The rte_cmdline application provides a list of pre-defined token types:
• String Token: Match a static string, a list of static strings or any string.
• Number Token: Match a number that can be signed or unsigned, from 8-bit to 32-bit.
• IP Address Token: Match an IPv4 or IPv6 address or network.
• Ethernet Address Token: Match a MAC address.

In this example, a new token type, obj_list, is defined and implemented in the parse_obj_list.c and parse_obj_list.h files.

For example, the cmd_obj_del_show command is defined as shown below:

    struct cmd_obj_del_show_result {
        cmdline_fixed_string_t action;
        struct object *obj;
    };

    static void cmd_obj_del_show_parsed(void *parsed_result,
        struct cmdline *cl,
        __attribute__((unused)) void *data)
    {
        /* ... */
    }

    cmdline_parse_token_string_t cmd_obj_action =
        TOKEN_STRING_INITIALIZER(struct cmd_obj_del_show_result,
            action, "show#del");
    parse_token_obj_list_t cmd_obj_obj =
        TOKEN_OBJ_LIST_INITIALIZER(struct cmd_obj_del_show_result,
            obj, &global_obj_list);

    cmdline_parse_inst_t cmd_obj_del_show = {
        .f = cmd_obj_del_show_parsed,   /* function to call */
        .data = NULL,                   /* 2nd arg of func */
        .help_str = "Show/del an object",
        .tokens = {                     /* token list, NULL terminated */
            (void *)&cmd_obj_action,
            (void *)&cmd_obj_obj,
            NULL,
        },
    };
the named ring structure used as the send_ring in the primary process is the recv_ring in the secondary process.

Once the rings and memory pools are all available in both the primary and secondary processes, the application simply dedicates two threads to sending and receiving messages respectively. The receive thread simply dequeues any messages on the receive ring, prints them, and frees the buffer space used by the messages back to the memory pool. The send thread makes use of the command prompt library to interactively request user input for messages to send. Once a send command is issued by the user, a buffer is allocated from the memory pool, filled in with the message contents, then enqueued on the appropriate rte_ring.

Symmetric Multi-process Example

The second example of Intel DPDK multi-process support demonstrates how a set of processes can run in parallel, with each process performing the same set of packet processing operations. (Since each process is identical in functionality to the others, we refer to this as symmetric multi-processing, to differentiate it from asymmetric multi-processing, such as the client-server mode of operation seen in the next example, where different processes perform different tasks, yet co-operate to form a packet processing system.) The following diagram shows the data flow through the application, using two processes.
to the Ethernet address of the port are accepted. By default, promiscuous mode is enabled so that packets are accepted regardless of the packet's Ethernet MAC destination address.
• --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues from which ports are mapped to which cores.

Refer to Chapter 10, L3 Forwarding Sample Application, on page 58 for more detailed descriptions of the --config command line option.

As an example, to run the application with two ports and two cores, which are using different Intel QuickAssist Technology execution engines, performing AES-CBC-128 encryption with AES-XCBC-MAC-96 hash, the following settings can be used:
• Traffic generator source IP address: 0.9.6.1
• Command line:

    ./build/dpdk_qat -c 0xff -n 2 -- -p 0x3 --config '(0,0,1),(1,0,2)'

Refer to the Intel DPDK Test Report for more examples of traffic generator setup and the application startup command lines. If no errors are generated in response to the startup commands, the application is running correctly.

Quota and Watermark Sample Application

20  Quota and Watermark Sample Application

The Quota and Watermark sample application is a simple example of packet processing using the Intel Data Plane Development Kit (Intel DPDK) that showcases the use of a quota
work to be done for each outgoing packet, that is, the clone operation is skipped completely. However, there is a price to pay: the input packet's metadata must remain intact, so for N destination ports, we need to invoke mcast_out_pkt() N times.

Therefore, for a small number of outgoing ports (and segments in the input packet), the first approach is faster. As the number of outgoing ports and/or input segments grows, the second approach becomes more preferable.

Depending on the number of segments or the number of ports in the outgoing portmask, either the first (with cloning) or the second (without cloning) approach is taken:

    use_clone = (port_num <= MCAST_CLONE_PORTS &&
        m->pkt.nb_segs <= MCAST_CLONE_SEGS);

It is the mcast_out_pkt() function that performs the packet duplication (either with or without actually cloning the buffers):

    static inline struct rte_mbuf *
    mcast_out_pkt(struct rte_mbuf *pkt, int use_clone)
    {
        struct rte_mbuf *hdr;

        /* Create new mbuf for the header. */
        if (unlikely((hdr = rte_pktmbuf_alloc(header_pool)) == NULL))
            return NULL;

        /* If requested, then make a new clone packet. */
        if (use_clone != 0 &&
            unlikely((pkt = rte_pktmbuf_clone(pkt, clone_pool)) == NULL)) {
            rte_pktmbuf_free(hdr);
            return NULL;
        }

        /* prepend new header */
        hdr->pkt.next = pkt;

        /* update header's fields */
        hdr->pkt.pkt_len = (uint16_t)(hdr->pkt.data_len + pkt->pkt.pkt_len);
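The second (no-clone) approach rests on plain reference counting of the shared data buffer. The following is a minimal, hypothetical sketch of that mechanism, independent of the rte_mbuf structures: each outgoing packet attaches a reference, and the buffer may only be recycled when the last reference is dropped.

```c
#include <assert.h>

/* Toy stand-in for a shared packet data buffer (not an rte_mbuf). */
struct shared_buf {
    int refcnt;
};

/* One more outgoing packet now shares this buffer's data. */
static void buf_attach(struct shared_buf *b)
{
    b->refcnt++;
}

/* Drop one reference; return 1 only when the buffer may actually be
 * recycled, i.e. when the last reference is gone. */
static int buf_free(struct shared_buf *b)
{
    return --b->refcnt == 0;
}
```

With N destination ports, N attach calls are made up front, and only the N-th free actually releases the data, which is why the metadata of the original packet must stay intact until then.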
    ...0,0), 16, 3},
    {IPv4(100,50,0,0), 16, 4},
    {IPv4(100,60,0,0), 16, 5},
    {IPv4(100,70,0,0), 16, 6},
    {IPv4(100,80,0,0), 16, 7},
};

The default l3fwd_ipv6_route_array table is:

    struct l3fwd_ipv6_route l3fwd_ipv6_route_array[] = {
        {{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 0},
        {{2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 1},
        {{3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 2},
        {{4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 3},
        {{5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 4},
        {{6,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 5},
        {{7,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 6},
        {{8,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, 48, 7},
    };

For example, for an input IPv4 packet with destination address 100.10.1.1 and packet length 9198 bytes, seven IPv4 packets will be sent out from port 0 to the destination address 100.10.1.1: six of those packets will have length 1500 bytes and one packet will have length 318 bytes.

The IP Fragmentation sample application provides basic NUMA support in that all the memory structures are allocated on all sockets that have active lcores on them.

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.
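The fragment counts in the example above follow directly from IPv4 fragmentation arithmetic: each fragment except the last carries a payload that fits the MTU and is a multiple of 8 bytes. The sketch below is a hypothetical helper that reproduces that arithmetic; it is not code from the sample application.

```c
#include <assert.h>

#define IPV4_HDR_LEN 20

/* Compute how many fragments a packet of pkt_len bytes splits into for a
 * given MTU, and the length of the last (short) fragment. Per-fragment
 * payload is rounded down to a multiple of 8, as IPv4 requires. */
static int ipv4_frag_count(int pkt_len, int mtu, int *last_frag_len)
{
    int payload  = pkt_len - IPV4_HDR_LEN;       /* data to split */
    int per_frag = (mtu - IPV4_HDR_LEN) & ~7;    /* 8-byte aligned */
    int full     = payload / per_frag;
    int rest     = payload % per_frag;

    if (rest == 0) {
        *last_frag_len = mtu;
        return full;
    }
    *last_frag_len = rest + IPV4_HDR_LEN;        /* short tail fragment */
    return full + 1;
}
```

For the 9198-byte packet and a 1500-byte MTU: the 9178-byte payload splits into six 1480-byte pieces (1500-byte fragments) plus a 298-byte remainder, giving the seventh, 318-byte fragment.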
    for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++)
        conf.dcb_queue[i] = (uint8_t)(i % (NUM_QUEUES / num_pools));

    (void)(rte_memcpy(eth_conf, &vmdq_dcb_conf_default, sizeof(*eth_conf)));
    (void)(rte_memcpy(&eth_conf->rx_adv_conf.vmdq_dcb_conf, &conf,
        sizeof(eth_conf->rx_adv_conf.vmdq_dcb_conf)));
    return 0;
}

Once the network port has been initialized using the correct VMDQ and DCB values, the initialization of the port's RX and TX hardware rings is performed similarly to that in the L2 Forwarding sample application. See Chapter 9, L2 Forwarding Sample Application in Real and Virtualized Environments, for more information.

VMDQ and DCB Forwarding Sample Application

22.4.2  Statistics Display

When run in a linuxapp environment, the VMDQ and DCB Forwarding sample application can display statistics showing the number of packets read from each RX queue. This is provided by way of a signal handler for the SIGHUP signal, which simply prints to standard output the packet counts in grid form. Each row of the output is a single pool, with the columns being the queue number within that pool.

To generate the statistics output, use the following command:

    user@host$ sudo killall -HUP vmdq_dcb_app

Please note that the statistics output will appear on the terminal where the vmdq_dc
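The SIGHUP-driven statistics pattern can be sketched in portable C as follows. This is a hypothetical variant, not the sample application's handler: here the handler only sets a flag and the main loop does the printing, which is the async-signal-safe version of the same idea.

```c
#include <assert.h>
#include <signal.h>

/* Flag set by the handler; the main packet loop checks it and, when set,
 * prints the per-queue packet counters in grid form. */
static volatile sig_atomic_t dump_stats;

static void on_sighup(int sig)
{
    (void)sig;
    dump_stats = 1;       /* defer the actual printing to the main loop */
}

static void install_stats_handler(void)
{
    signal(SIGHUP, on_sighup);
}
```

Sending the process a SIGHUP (as the killall -HUP command above does) then triggers a statistics dump on the next loop iteration.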
    cd $RTE_SDK/examples/l3fwd

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

10.3  Running the Application

The application has a number of command line options:

    ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len PKTLEN]] [--no-numa] [--hash-entry-num] [--ipv6]

where:
• -p PORTMASK: Hexadecimal bitmask of ports to configure
• -P: optional, sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.
• --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues from which ports are mapped to which cores
• --enable-jumbo: optional, enables jumbo frames
• --max-pkt-len: optional, maximum packet length in decimal (64-9600)
• --no-numa: optional, disables NUMA awareness
• --hash-entry-num: optional, specifies the hash entry number (in hexadecimal) to be set up
• --ipv6: optional, set it if running ip
Note: If the primary instance is terminated, the secondary instance must also be shut down and restarted after the primary. This is necessary because the primary instance will clear and reset the shared memory regions on startup, invalidating the secondary process's pointers. The secondary process can be stopped and restarted without affecting the primary process.

16.1.2.2  How the Application Works

The core of this example application is based on using two queues and a single memory pool in shared memory. These three objects are created at startup by the primary process, since the secondary process cannot create objects in memory as it cannot reserve memory zones, and the secondary process then uses lookup functions to attach to these objects as it starts up:

    if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
        send_ring = rte_ring_create(_PRI_2_SEC, ring_size, SOCKET0, flags);
        recv_ring = rte_ring_create(_SEC_2_PRI, ring_size, SOCKET0, flags);
        message_pool = rte_mempool_create(_MSG_POOL, pool_size,
            string_size, pool_cache, priv_data_sz,
            NULL, NULL, NULL, NULL, SOCKET0, flags);
    } else {
        recv_ring = rte_ring_lookup(_PRI_2_SEC);
        send_ring = rte_ring_lookup(_SEC_2_PRI);
        message_pool = rte_mempool_lookup(_MSG_POOL);
    }

Note, however, that
Vhost Sample Application

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- -p 0x1 --dev-basename usvhost --dev-index 1

Note: Please note that the huge-dir parameter instructs the Intel DPDK to allocate its memory from the 2 MB page hugetlbfs.

23.6.1  Parameters

Basename and Index. The Intel DPDK vhost-net sample code uses a Linux character device to communicate with QEMU. The basename and the index are used to generate the character device's name:

    /dev/<basename>-<index>

The index parameter is provided for a situation where multiple instances of the virtual switch are required. For compatibility with the QEMU wrapper script, a base name of "usvhost" and an index of "1" should be used:

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- -p 0x1 --dev-basename usvhost --dev-index 1

vm2vm. The vm2vm parameter sets the mode of packet switching between guests in the host. A value of 0 disables vm2vm, implying that a virtual machine's packet transmission will always go to the Ethernet port. A value of 1 means software-mode packet forwarding between guests; it needs a packet copy in vhost, so it is valid only in the one-copy implementation and invalid for the zero-copy implementation. A value of 2 means hardware-mode packet forwarding between guests; it allows packets to go to the Ethernet port, and the hardware L2 switch wi
15.1.2    Basic Multi-process Example ............................... 94
15.1.2.1  Running the Application ................................... 94
15.1.2.2  How the Application Works ................................. 96
15.1.3    Symmetric Multi-process Example ........................... 96
15.1.3.1  Running the Application ................................... 97
15.1.3.2  How the Application Works ................................. 98
15.1.4    Client-Server Multi-process Example ....................... 99
15.1.4.1  Running the Application ................................... 99
15.1.4.2  How the Application Works ................................. 100
15.1.5    Master-slave Multi-process Example ........................ 101
15.1.5.1  Master-slave Process Models ............................... 101
15.1.5.2  Slave Process Recovery Mechanism .......................... 102
15.1.5.3  Floating Process Support .................................. 103
15.1.5.4  Run the Application ....................................... 104
15.1.5.5  Explanation ............................................... 105
16        QoS Metering Sample Application ........................... 108
16.1      Overview .................................................. 108
16.2      Compiling the Application ................................. 108
16.3      Running the Application ................................... 109
16.4      Explanation ............................................... 109
17        QoS Scheduler Sample Application .......................... 111
17.1      Overview .................................................. 111
22  VMDQ and DCB Forwarding Sample Application

The VMDQ and DCB Forwarding sample application is a simple example of packet processing using the Intel DPDK. The application performs L2 forwarding using VMDQ and DCB to divide the incoming traffic into 128 queues. The traffic splitting is performed in hardware by the VMDQ and DCB features of the Intel 82599 10 Gigabit Ethernet Controller.

22.1  Overview

This sample application can be used as a starting point for developing a new application that is based on the Intel DPDK and uses VMDQ and DCB for traffic partitioning.

The VMDQ and DCB filters work on VLAN traffic to divide the traffic into 128 input queues on the basis of the VLAN ID field and the VLAN user priority field. VMDQ filters split the traffic into 16 or 32 groups based on the VLAN ID. Then, DCB places each packet into one of either 4 or 8 queues within that group, based upon the VLAN user priority field. In either case (16 groups of 8 queues, or 32 groups of 4 queues), the traffic can be split into 128 hardware queues on the NIC, each of which can be polled individually by an Intel DPDK application.

All traffic is read from a single incoming port (port 0) and output on port 1, without any processing being performed. The traffic is split into 128 queues on input, where each thread of the application reads from multiple queues. For example, when run with 8 threads, that is, with the
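The pool/queue arithmetic behind this splitting can be sketched as follows. This is a hypothetical model of the selection rule described above (VLAN ID picks the VMDq pool, VLAN user priority picks the DCB queue inside it), not the NIC's actual filter logic or any DPDK API.

```c
#include <assert.h>
#include <stdint.h>

/* Map (VLAN ID, user priority) to a flat hardware queue index for a
 * num_pools x queues_per_pool layout, e.g. 16 pools of 8 queues = 128. */
static unsigned vmdq_dcb_queue(uint16_t vlan_id, uint8_t user_priority,
                               unsigned num_pools, unsigned queues_per_pool)
{
    unsigned pool  = vlan_id % num_pools;              /* VMDq stage */
    unsigned queue = user_priority % queues_per_pool;  /* DCB stage  */
    return pool * queues_per_pool + queue;
}
```

With 16 pools of 8 queues, VLAN ID 17 with user priority 7 lands in pool 1, queue 7, that is, flat queue 15.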
The l2fwd_send_packet() function stores the packet in a per-lcore and per-txport table. If the table is full, the whole packet table is transmitted using the l2fwd_send_burst() function:

```c
/* Send the packet on an output interface */
static int
l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
{
    unsigned lcore_id, len;
    struct lcore_queue_conf *qconf;

    lcore_id = rte_lcore_id();
    qconf = &lcore_queue_conf[lcore_id];
    len = qconf->tx_mbufs[port].len;
    qconf->tx_mbufs[port].m_table[len] = m;
    len++;

    /* enough pkts to be sent */
    if (unlikely(len == MAX_PKT_BURST)) {
        l2fwd_send_burst(qconf, MAX_PKT_BURST, port);
        len = 0;
    }

    qconf->tx_mbufs[port].len = len;
    return 0;
}
```

To ensure that no packets remain in the tables, each lcore drains its TX queues in its main loop. This technique introduces some latency when there are not many packets to send; however, it improves performance:

```c
cur_tsc = rte_rdtsc();

/* TX burst queue drain */
diff_tsc = cur_tsc - prev_tsc;
if (unlikely(diff_tsc > drain_tsc)) {
    for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
        if (qconf->tx_mbufs[portid].len == 0)
            continue;
        l2fwd_send_burst(&lcore_queue_conf[lcore_id],
                         qconf->tx_mbufs[portid].len,
                         (uint8_t) portid);
        qconf->tx_mbufs[portid].len = 0;
    }

    /* if timer is enabled */
```
Intel Data Plane Development Kit (Intel DPDK)
Sample Applications User Guide

June 2014
Document Number: 328218-008

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS, COSTS, DAMAGES, AND EXPENSES, AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTE
8 Kernel NIC Interface Sample Application

The Kernel NIC Interface (KNI) is an Intel DPDK control plane solution that allows userspace applications to exchange packets with the kernel networking stack. To accomplish this, Intel DPDK userspace applications use an IOCTL call to request the creation of a KNI virtual device in the Linux kernel. The IOCTL call provides interface information and the Intel DPDK's physical address space, which is re-mapped into the kernel address space by the KNI kernel loadable module, which saves the information to a virtual device context. The Intel DPDK creates FIFO queues for packet ingress and egress to the kernel module for each device allocated.

The KNI kernel loadable module is a standard net driver, which, upon receiving the IOCTL call, accesses the Intel DPDK's FIFO queues to receive/transmit packets from/to the Intel DPDK userspace application. The FIFO queues contain pointers to data packets in the Intel DPDK. This:

- Provides a faster mechanism to interface with the kernel net stack and eliminates system calls
- Facilitates the Intel DPDK using standard Linux userspace net tools (tcpdump, ftp, and so on)
- Eliminates the copy_to_user and copy_from_user operations on packets

The Kernel NIC Interface sample application is a simple example that demonstrates the u
L PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others.

Copyright © 2014, Intel Corporation. All rights reserved.
For example, for a fragmented input IPv4 packet with destination address 100.10.1.1, a reassembled IPv4 packet is sent out from port 0 to the destination address 100.10.1.1 once all the fragments are collected.

7.4 Explanation

The following sections provide some explanation of the sample application code. As mentioned in the overview section, the initialization and run-time paths are very similar to those of the L2 Forwarding application (see Chapter 9, "L2 Forwarding Sample Application", for more information). The following sections describe aspects that are specific to the IP Reassembly sample application.

7.4.1 IPv4 Fragment Table Initialization

This application uses the rte_ip_frag library. Please refer to the Programmer's Guide for a more detailed explanation of how to use this library. The fragment table maintains information about already received fragments of the packet. Each IP packet is uniquely identified by the triple <Source IP address>, <Destination IP address>, <ID>. To avoid lock contention, each RX queue has its own Fragment Table, e.g. the application can
Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

20.3 Explanation

The following sections provide some explanation of the code.

20.3.1 Initialization and Main Loop

In addition to EAL initialization, the timer subsystem must be initialized by calling the rte_timer_subsystem_init() function:

```c
/* init EAL */
ret = rte_eal_init(argc, argv);
if (ret < 0)
    rte_panic("Cannot init EAL\n");

/* init RTE timer library */
rte_timer_subsystem_init();
```

After timer creation (see the next paragraph), the main loop is executed on each slave lcore using the well-known rte_eal_remote_launch(), and also on the master:

```c
/* call lcore_mainloop() on every slave lcore */
RTE_LCORE_FOREACH_SLAVE(lcore_id) {
    rte_eal_remote_launch(lcore_mainloop, NULL, lcore_id);
}

/* call it on master lcore too */
(void) lcore_mainloop(NULL);
```

The main loop is very simple in this example:

```c
while (1) {
    /*
     * Call the timer handler on each core: as we don't
     * need a very precise timer, so only call
     * rte_timer_manage() every ~10ms (at 2 Ghz). In a real
     * application, this will enhance performances as
     * reading the HPET timer is not efficient.
     */
    cur_tsc = rte_rdtsc();
    diff_tsc = cur_tsc - prev_tsc;
    if (diff_tsc > TIMER_RESOL
```
The content of the Intel DPDK, the build system (including the commands that can be used in the root Intel DPDK Makefile to build the development kit and an application), and guidelines for porting an application; optimizations used in the software, and those that should be considered for new development. A glossary of terms is also provided.

- API Reference: Provides detailed information about Intel DPDK functions, data structures, and other programming constructs.
- Sample Applications User Guide: Describes a set of sample applications. Each chapter describes a sample application that showcases specific functionality and provides instructions on how to compile, run, and use the sample application.

2 Command Line Sample Application

This chapter describes the Command Line sample application that is part of the Intel Data Plane Development Kit (Intel DPDK).

2.1 Overview

The Command Line sample application is a simple application that demonstrates the use of the command line interface in the Intel DPDK. This application is a readline-like interface that can be used to debug an Intel DPDK application in a Linux application environment.

Caution: The rte_cmdline library should not be used in production code, since it is not validated to the same standar
```c
UTION_CYCLES) {
        rte_timer_manage();
        prev_tsc = cur_tsc;
    }
}
```

As explained in the comment, it is better to use the TSC register (as it is a per-lcore register) to check whether the rte_timer_manage() function must be called or not. In this example, the resolution of the timer is 10 milliseconds.

20.3.2 Managing Timers

In the main() function, the two timers are initialized. This call to rte_timer_init() is necessary before doing any other operation on the timer structure:

```c
/* init timer structures */
rte_timer_init(&timer0);
rte_timer_init(&timer1);
```

Then, the two timers are configured:

- The first timer (timer0) is loaded on the master lcore and expires every second. Since the PERIODICAL flag is provided, the timer is reloaded automatically by the timer subsystem. The callback function is timer0_cb().
- The second timer (timer1) is loaded on the next available lcore every 333 ms. The SINGLE flag means that the timer expires only once, and must be reloaded manually if required. The callback function is timer1_cb().

```c
/* load timer0, every second, on master lcore, reloaded automatically */
hz = rte_get_hpet_hz();
lcore_id = rte_lcore_id();
rte_timer_reset(&timer0, hz, PERIODICAL, lcore_id, timer0_cb, NULL);

/* load timer1, every seco
```
Finally, the send_stage() function's task is to read packets from the last ring in a pipeline and send them on the destination port defined in the port_pairs[] array. It is running on the last available logical core only:

```c
lcore_id = rte_lcore_id();
previous_lcore_id = get_previous_lcore_id(lcore_id);

for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
    if (!is_bit_set(port_id, portmask))
        continue;

    dest_port_id = port_pairs[port_id];
    tx = rings[previous_lcore_id][port_id];

    if (rte_ring_empty(tx))
        continue;

    /* Dequeue packets from tx and send them */
    nb_dq_pkts = rte_ring_dequeue_burst(tx, (void *) tx_pkts, *quota);
    nb_tx_pkts = rte_eth_tx_burst(dest_port_id, 0, tx_pkts, nb_dq_pkts);
}
```

For each port in the port mask, up to *quota packets are pulled from the last ring in its pipeline and sent on the destination port paired with the current port.

19.4.2 Control Application: qwctl

The qwctl application uses the rte_cmdline library to provide the user with an interactive command line that can be used to modify and inspect parameters in a running qw application. Those parameters are the global quota and low_watermark values, as well as each ring's built-in high watermark.

19.4.2.1 Command Definitions

The available commands are defined in commands.c. It is advised to use the cmdline sample application user guide as a reference for everything related to the rte_cmdline library.

19.4.2.2 Accessing Shared Varia
```c
        .pthresh = TX_PTHRESH,
        .hthresh = TX_HTHRESH,
        .wthresh = TX_WTHRESH,
    },
    .tx_free_thresh = RTE_TEST_TX_DESC_DEFAULT + 1, /* disable feature */
};
```

9.4.6 Receive Process and Transmit Packets

In the l2fwd_main_loop() function, the main task is to read ingress packets from the RX queues. This is done using the following code:

```c
/* Read packet from RX queues */
for (i = 0; i < qconf->n_rx_port; i++) {
    portid = qconf->rx_port_list[i];
    nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst, MAX_PKT_BURST);

    for (j = 0; j < nb_rx; j++) {
        m = pkts_burst[j];
        rte_prefetch0(rte_pktmbuf_mtod(m, void *));
        l2fwd_simple_forward(m, portid);
    }
}
```

Packets are read in a burst of size MAX_PKT_BURST. The rte_eth_rx_burst() function writes the mbuf pointers into a local table and returns the number of available mbufs in the table.

Then, each mbuf in the table is processed by the l2fwd_simple_forward() function. The processing is very simple: derive the TX port from the RX port, then replace the source and destination MAC addresses.

Note: In the following code, one line for getting the output port requires some explanation. During the initialization process, a static array of destination ports (l2fwd_dst_ports[])
a of 0x40000000 bytes
EAL: Virtual area found at 0x7 200000000 (size = 0x40000000)
EAL: check igb_uio module
EAL: check module finished
EAL: Master core 0 is ready (tid=54e41820)
EAL: Core 1 is ready (tid=53b32700)
Starting core 1
simple_mp >

To run the secondary process to communicate with the primary process, again run the same binary, setting at least two cores in the coremask:

```
./build/simple_mp -c C -n 4 --proc-type=secondary
```

When running a secondary process such as that shown above, the --proc-type parameter can again be specified as auto. However, omitting the parameter altogether will cause the process to try and start as a primary rather than a secondary process. Once the process type is specified correctly, the process starts up, displaying largely similar status messages to the primary instance as it initializes. Once again, you will be presented with a command prompt.

Once both processes are running, messages can be sent between them using the send command. At any stage, either process can be terminated using the quit command:

```
EAL: Master core 10 is ready (tid=b5f89820)     EAL: Master core 8 is ready (tid=864a3820)
EAL: Core 11 is ready (tid=84ffe700)            EAL: Core 9 is ready (tid=85995700)
Starting core 11                                Starting core 9
simple_mp > send hello_secondary                simple_mp > core 9: Received 'hello_secondary'
simple_mp > core 11: Received 'hello_primary'   simple_mp > send hello_primary
simple_mp > quit                                simple_mp > quit
```
able the vhost-net sample code to map the VM's memory into its own process address space:

```
user@target:~$ qemu-system-x86_64 -mem-prealloc -mem-path /dev/hugepages
```

The QEMU wrapper (qemu-wrap.py) is a Python script designed to automate the QEMU configuration described above. It also facilitates integration with libvirt, although the script may also be used standalone, without libvirt.

23.7.1 Redirecting QEMU to vhost-net Sample Code

To redirect QEMU to the vhost-net sample code implementation of the vhost-net API, an open file descriptor must be passed to QEMU running as a child process:

```python
#!/usr/bin/python
import os
import subprocess

fd = os.open("/dev/usvhost-1", os.O_RDWR)
subprocess.call("qemu-system-x86_64 ... -netdev tap,id=vhostnet0,vhost=on,vhostfd="
                + str(fd) + " ...", shell=True)
```

Note: This process is automated in the QEMU wrapper script discussed in Section 23.7.3.

23.7.2 Mapping the Virtual Machine's Memory

For the Intel DPDK vhost-net sample code to run correctly, QEMU must allocate the VM's memory on hugetlbfs. This is done by specifying mem-prealloc and mem-path when executing QEMU. The vhost-net sample code accesses the virtio-net device's virtual rings and packet buffers by finding and mapping the VM's physical memory on hugetlbfs. In this case, the path passed to the guest should be that of the 1 GB page hugetlbfs:

```
user@target:~$ qemu-system-x86_64 -mem-prealloc -mem-path /dev/hugepages
```
- Intel Communications Chipset 8925 to 8955 Series Software for Linux Getting Started Guide (523128)

For more details on the actual platforms used to validate this application, as well as performance numbers, please refer to the Test Report, which is accessible by contacting your Intel representative.

18.2 Building the Application

Steps to build the application:

1. Set up the following environment variables:

```
export RTE_SDK=<Absolute path to the Intel DPDK installation folder>
export ICP_ROOT=<Absolute path to the Intel QAT installation folder>
```

2. Set the target (a default target is used if not specified). For example:

```
export RTE_TARGET=x86_64-native-linuxapp-gcc
```

Refer to the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

```
cd $RTE_SDK/examples/dpdk_qat
make
```

18.3 Running the Application

18.3.1 Intel QuickAssist Technology Configuration Files

The Intel QuickAssist Technology configuration files used by the application are located in the config_files folder in the application folder. The following sets of configuration files are included in the Intel DPDK package:

- Stargo CRB (single CPU socket): located in the stargo folder
  - dh89xxcc_qa_dev0.conf
- Shumway CRB (dual CPU socket): located i
are configured with pipe profile 0:

```
; Pipe configuration
[pipe profile 0]
tb rate = 305175                ; Bytes per second
tb size = 1000000               ; Bytes

tc 0 rate = 305175              ; Bytes per second
tc 1 rate = 305175              ; Bytes per second
tc 2 rate = 305175              ; Bytes per second
tc 3 rate = 305175              ; Bytes per second
tc period = 40                  ; Milliseconds

tc 0 oversubscription weight = 1
tc 1 oversubscription weight = 1
tc 2 oversubscription weight = 1
tc 3 oversubscription weight = 1

tc 0 wrr weights = 1 1 1 1
tc 1 wrr weights = 1 1 1 1
tc 2 wrr weights = 1 1 1 1
tc 3 wrr weights = 1 1 1 1

; RED params per traffic class and color (Green / Yellow / Red)
[red]
tc 0 wred min = 48 40 32
tc 0 wred max = 64 64 64
tc 0 wred inv prob = 10 10 10
tc 0 wred weight = 9 9 9

tc 1 wred min = 48 40 32
tc 1 wred max = 64 64 64
tc 1 wred inv prob = 10 10 10
tc 1 wred weight = 9 9 9

tc 2 wred min = 48 40 32
tc 2 wred max = 64 64 64
tc 2 wred inv prob = 10 10 10
tc 2 wred weight = 9 9 9

tc 3 wred min = 48 40 32
tc 3 wred max = 64 64 64
tc 3 wred inv prob = 10 10 10
tc 3 wred weight = 9 9 9
```

17.3.1 Interactive mode

These are the commands that are currently working under the command line interface:

- Control Commands
  - quit: Quits the application
- General Statistics
arget:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry [0|1]

RX Retry Number. The rx-retry-num option specifies the number of retries on an RX burst; it takes effect only when rx retry is enabled. The default value is 4.

```
user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry 1 --rx-retry-num 5
```

RX Retry Delay Time. The rx-retry-delay option specifies the timeout (in microseconds) between retries on an RX burst; it takes effect only when rx retry is enabled. The default value is 15.

```
user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry 1 --rx-retry-delay 20
```

Zero copy. The zero-copy option enables/disables the zero-copy mode for RX/TX packets. In the zero-copy mode, the packet buffer address from the guest is translated into a host physical address and then set directly as the DMA address. If the zero-copy mode is disabled, then a one-copy mode is utilized in the sample. This option is disabled by default.

```
user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy [0|1]
```

RX descriptor number. The RX descriptor number option specifies the Ethernet RX descriptor number. Linux legacy virtio-net has different behaviour in how it uses the vring descriptors from the DPDK-based virtio-net PMD: the former is likely to allocate half for the virtio header and the other half for the frame buffer, while the latter allocates all of them for the frame buffer. This leads to a different number of availabl
arget is used if not specified). For example:

```
export RTE_TARGET=x86_64-native-linuxapp-gcc
```

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

```
make
```

5.3 Running the Application

The LPM object is created and loaded with the pre-configured entries read from the global l3fwd_ipv4_route_array and l3fwd_ipv6_route_array tables. For each input packet, the packet forwarding decision (that is, the identification of the output interface for the packet) is taken as a result of the LPM lookup. If the IP packet size is greater than the default output MTU, then the input packet is fragmented and several fragments are sent via the output interface.

Application usage:

```
./build/ip_fragmentation [EAL options] -- -p PORTMASK [-q NQ]
```

where:

- -p PORTMASK is a hexadecimal bitmask of ports to configure
- -q NQ is the number of queues (= ports) per lcore (the default is 1)

To run the example in a linuxapp environment with 2 lcores (2, 4) over 2 ports (0, 2), with 1 RX queue per lcore:

```
./build/ip_fragmentation -c 0x14 -n 3 -- -p 5
EAL: coremask set to 14
EAL: Detected lcore 0 on socket 0
EAL: Detected lcore 1 on socket 1
EAL: Detected lcore 2 on socket 0
EAL: Detected lcore 3 on socket 1
EAL: Detected lcore 4 on socket 0
```
as it performs unidirectional L2 forwarding of packets from one port to a second port. No command-line options are taken by this application, apart from the standard EAL command-line options.

Note: Since VMD queues are being used for VMM, this application works correctly when VT-d is disabled in the BIOS or in the Linux kernel (intel_iommu=off).

22.2 Compiling the Application

1. Go to the examples directory:

```
export RTE_SDK=/path/to/rte_sdk
cd ${RTE_SDK}/examples/vmdq_dcb
```

2. Set the target (a default target is used if not specified). For example:

```
export RTE_TARGET=x86_64-native-linuxapp-gcc
```

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

```
make
```

22.3 Running the Application

To run the example in a linuxapp environment:

```
user@target:~$ ./build/vmdq_dcb -c f -n 4 -- -p 0x3 --nb-pools 16
```

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

22.4 Explanation

The following sections provide some explanation of the code.

22.4.1 Initialization

The EAL, driver and PCI configuration is performed largely as in the L2 Forwarding sample application, as is the creation of the mbuf pool. See Chapter 9, "L2 Forwarding Sample App
as the maximum number of packets that can be enqueued/dequeued at a time, and low and high watermarks to signal low and high ring usage, respectively.

Additionally, it shows how ring watermarks can be used to feed congestion notifications back to data producers, by temporarily stopping processing on overloaded rings and sending Ethernet flow control frames.

This sample application is split into two parts:

- qw: The core quota and watermark sample application
- qwctl: A command-line tool to alter quota and watermarks while qw is running

19.1 Overview

The Quota and Watermark sample application performs forwarding for each packet that is received on a given port. The destination port is the adjacent port from the enabled port mask; that is, if the first four ports are enabled (port mask 0xf), ports 0 and 1 forward into each other, and ports 2 and 3 forward into each other. The MAC addresses of the forwarded Ethernet frames are not affected.

Internally, packets are pulled from the ports by the master logical core and put on a variable-length processing pipeline, each stage of which is connected by rings, as shown in Figure 13.

Figure 13. Pipeline Overview (lcore 1 dequeues packets from the master lcore's rings; lcore N dequeues packets from lcore N-1's rings and enqueues them on its own)
ation return 1

8.6.2 Packet Forwarding

After the initialization steps are completed, the main loop function is run on each lcore. This function first checks the lcore_id against the user-provided lcore_rx and lcore_tx to see if this lcore is reading from or writing to kernel NIC interfaces.

For the case that reads from a NIC port and writes to the kernel NIC interfaces, the packet reception is the same as in the L2 Forwarding sample application (see Section 9.4.6, "Receive Process and Transmit Packets", on page 54). The packet transmission is done by sending mbufs into the kernel NIC interfaces with rte_kni_tx_burst(). The KNI library automatically frees the mbufs after the kernel has successfully copied them.

```c
/**
 * Interface to burst rx and enqueue mbufs into rx_q
 */
static void
kni_ingress(struct kni_port_params *p)
{
    uint8_t i, nb_kni, port_id;
    unsigned nb_rx, num;
    struct rte_mbuf *pkts_burst[PKT_BURST_SZ];

    if (p == NULL)
        return;

    nb_kni = p->nb_kni;
    port_id = p->port_id;
    for (i = 0; i < nb_kni; i++) {
        /* Burst rx from eth */
        nb_rx = rte_eth_rx_burst(port_id, 0, pkts_burst, PKT_BURST_SZ);
        if (unlikely(nb_rx > PKT_BURST_SZ)) {
            RTE_LOG(ERR, APP, "Error receiving from eth\n");
            return;
        }
        /* Burst tx to kni */
        num = rte_kni_tx_burst(p->kni[i], pkts_burst, nb_rx);
        kni_stats[port_id].rx_packets += num;

        rte_kni_handle_request(p->kni[i]);
        if (unlikely(num < nb_rx)) {
            /* Free mbu
```
ation's arguments are parsed, an mbuf pool is created. It contains a set of mbuf objects that are used by the driver and the application to store network packets:

```c
/* Create a pool of mbuf to store packets */
mbuf_pool = rte_mempool_create("mbuf_pool", MBUF_PER_POOL, MBUF_SIZE, 32,
        sizeof(struct rte_pktmbuf_pool_private),
        rte_pktmbuf_pool_init, NULL,
        rte_pktmbuf_init, NULL,
        rte_socket_id(), 0);

if (mbuf_pool == NULL)
    rte_panic("%s\n", rte_strerror(rte_errno));
```

The rte_mempool is a generic structure used to handle pools of objects. In this case, it is necessary to create a pool that will be used by the driver, which expects to have some reserved space in the mempool structure: sizeof(struct rte_pktmbuf_pool_private) bytes. The number of allocated pktmbufs is MBUF_PER_POOL, with a size of MBUF_SIZE each. A per-lcore cache of 32 mbufs is kept. The memory is allocated on the master lcore's socket, but it is possible to extend this code to allocate one mbuf pool per socket.

Two callback pointers are also given to the rte_mempool_create() function:

- The first callback pointer is to rte_pktmbuf_pool_init() and is used to initialize the private data of the mempool, which is needed by the driver. This function is provided by the mbuf API, but can be copied and extended by the developer.
- The second callback pointer given to rte_mempool_create() is the mbuf initializer. The default is used, that is, rte_pktmbuf_init(), which is p
17.2 Compiling the Application ................................... 112
17.3 Running the Application ..................................... 112
  17.3.1 Interactive mode ........................................ 114
  17.3.2 Example ................................................. 115
17.4 Explanation .................................................. 115
18 Intel QuickAssist Technology Sample Application .............. 117
  18.1 Overview ................................................... 117
  … ............................................................... 118
  18.2 Building the Application ................................... 119
  18.3 Running the Application .................................... 119
    18.3.1 Intel QuickAssist Technology Configuration Files ..... 119
    18.3.2 Traffic Generator Setup and Application Startup ...... 120
19 Quota and Watermark Sample Application ....................... 121
  19.1 Overview ................................................... 121
  19.2 Compiling the Application .................................. 123
  19.3 Running the Application .................................... 123
    19.3.1 Running the Core Application .......................... 124
    19.3.2 Running the Control Application ....................... 124
  19.4 Explanation ................................................ 124
    19.4.1 Core Application: qw .................................. 125
      19.4.1.1 EAL and Drivers Setup ............................. 125
      19.4.1.2 Shared Variables Setup
25 … .............................................................. 158
  25.1 Overview ................................................... 158
  25.2 Compiling the Application .................................. 158
  25.3 Running the Application .................................... 159
    25.3.1 Application Command Line .............................. 159
    25.3.2 Table Types and Behavior .............................. 159
    25.3.3 Input Traffic ......................................... 161

Figures
Figure 1.  Packet Flow .......................................... 15
Figure 2.  Kernel NIC Application Packet Flow ................... 39
Figure 3.  Performance Benchmark Setup (Basic Environment) ...... 49
Figure 4.  Performance Benchmark Setup (Virtualized Environment)  49
Figure 5.  Load Balancer Application Architecture ............... 89
Figure 6.  Example Data Flow in a Symmetric Multi-process Application  97
Figure 7.  Example Data Flow in a Client-Server Symmetric Multi-process Application  99
Figure 8.  Master-slave Process Workflow ....................... 101
Figure 9.  Slave Process Recovery Process Flow ................. 103
Figure 10. QoS Scheduler Application Architecture .............. 111
Figure 11. Intel QuickAssist Technology Application Block Diagram  117
Figure 12. …
Figure 13. …
Figure 14. …
Figure 15. …
Figure 16. …
b app is running, rather than from the terminal from which the HUP signal was sent.

23 Vhost Sample Application

The vhost sample application demonstrates integration of the Intel Data Plane Development Kit (Intel DPDK) with the Linux KVM hypervisor by implementing the vhost-net offload API. The sample application performs simple packet switching between virtual machines based on Media Access Control (MAC) address or Virtual Local Area Network (VLAN) tag. The splitting of Ethernet traffic from an external switch is performed in hardware by the Virtual Machine Device Queues (VMDQ) and Data Center Bridging (DCB) features of the Intel 82599 10 Gigabit Ethernet Controller.

23.1 Background

Virtio networking (virtio-net) was developed as the Linux KVM para-virtualized method for communicating network packets between host and guest. It was found that virtio-net performance was poor due to context switching and packet copying between host, guest, and QEMU. The following figure shows the system architecture for virtio-based networking (virtio-net).

Figure 17. QEMU Virtio-net (prior to vhost-net)

The Linux kernel vhost-net module was developed as an offload mechanism for virtio-net. The vhost-net module enables KVM (QEMU) to offload th
been possible to call the lsi_send_burst() function directly from the main loop, to send all the received packets on the same TX port, using the burst-oriented send function, which is more efficient. However, in real-life applications (such as L3 routing), packet N is not necessarily forwarded on the same port as packet N-1. The application is implemented to illustrate that, so the same approach can be reused in a more complex application.

The lsi_send_packet() function stores the packet in a per-lcore and per-txport table. If the table is full, the whole packet table is transmitted using the lsi_send_burst() function:

```c
/* Send the packet on an output interface */
static int
lsi_send_packet(struct rte_mbuf *m, uint8_t port)
{
    unsigned lcore_id, len;
    struct lcore_queue_conf *qconf;

    lcore_id = rte_lcore_id();
    qconf = &lcore_queue_conf[lcore_id];
    len = qconf->tx_mbufs[port].len;
    qconf->tx_mbufs[port].m_table[len] = m;
    len++;

    /* enough pkts to be sent */
    if (unlikely(len == MAX_PKT_BURST)) {
        lsi_send_burst(qconf, MAX_PKT_BURST, port);
        len = 0;
    }

    qconf->tx_mbufs[port].len = len;
    return 0;
}
```

To ensure that no packets remain in the tables, each lcore drains the TX queue in its main loop. This technique introduces some latency when there are not many packets to send. However, it improves performance:

```c
cur_tsc = rte_rdtsc();

/* TX burst queue drain */
diff_tsc = cur_tsc - prev_tsc;
if (unlike
```
bles. The setup_shared_variables() function retrieves the shared variables quota and low_watermark from the rte_memzone previously created by qw:

    static void
    setup_shared_variables(void)
    {
        const struct rte_memzone *qw_memzone;

        qw_memzone = rte_memzone_lookup(QUOTA_WATERMARK_MEMZONE_NAME);
        if (qw_memzone == NULL)
            rte_exit(EXIT_FAILURE, "Couldn't find memzone\n");

        quota = qw_memzone->addr;
        low_watermark = (unsigned int *) qw_memzone->addr + sizeof(int);
    }

21 Timer Sample Application

The Timer sample application is a simple application that demonstrates the use of a timer in an Intel DPDK application. This application prints some messages from different lcores regularly, demonstrating the use of timers.

21.1 Compiling the Application

1. Go to the example directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/timer

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

       make

21.2 Running the Application

To run the example in a linuxapp environment:

    ./build/timer -c f -n 4

Refer to the Intel DPDK Getting
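The lookup above simply carves two variables out of the start of a shared memory region that both the qw and qwctl processes can locate by name. A minimal stand-in (plain C, with an ordinary buffer playing the role of the memzone's addr field, and a byte offset of sizeof(int) between the two variables) sketches the layout the two processes would have to agree on; names are illustrative:

```c
#include <string.h>

/* quota lives at the start of the shared zone. */
int *
get_quota(void *zone_addr)
{
    return (int *) zone_addr;
}

/* low_watermark lives sizeof(int) bytes further in. */
unsigned int *
get_low_watermark(void *zone_addr)
{
    return (unsigned int *) ((char *) zone_addr + sizeof(int));
}
```

In the real application, one process reserves the memzone and the other looks it up, but both see the same physical memory, so a write through one pointer is visible to the other process.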
-c FF option, each thread receives and forwards packets from 16 queues.

As supplied, the sample application configures the VMDQ feature to have 16 pools with 8 queues each, as indicated in Figure 16. The Intel 82599 10 Gigabit Ethernet Controller NIC also supports the splitting of traffic into 32 pools of 4 queues each, and this can be used by changing the NUM_POOLS parameter in the supplied code. The NUM_POOLS parameter can be passed on the command line, after the EAL parameters:

    ./build/vmdq_dcb [EAL options] -- -p PORTMASK --nb-pools NP

where NP can be 16 or 32.

Figure 16. Packet Flow Through the VMDQ and DCB Sample Application

In Linux user space, the application can display statistics with the number of packets received on each queue. To have the application display the statistics, send a SIGHUP signal to the running application process, where <pid> is the process ID of the application process:

    kill -HUP <pid>

The VMDQ and DCB Forwarding sample application is in many ways simpler than the L2 Forwarding application (see Chapter 9, L2 Forwarding Sample Application (in Real and Virtualized Environments)),
contributes to a power-efficient packet processing solution when serving light network loads.

The rte_eth_rx_burst() function and the newly-added rte_eth_rx_queue_count() function are used in the endless packet processing loop to return the number of received and available Rx descriptors. Those numbers for a specific queue are passed to the P-state and C-state heuristic algorithms, to generate hints based on recent network load trends.

Note: Only power control related code is shown.

    static __attribute__((noreturn)) int
    main_loop(__attribute__((unused)) void *dummy)
    {
        while (1) {

            /* Read packet from RX queues */

            lcore_scaleup_hint = FREQ_CURRENT;
            lcore_rx_idle_count = 0;

            for (i = 0; i < qconf->n_rx_queue; ++i) {
                rx_queue = &(qconf->rx_queue_list[i]);
                rx_queue->idle_hint = 0;
                portid = rx_queue->port_id;
                queueid = rx_queue->queue_id;

                nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
                        MAX_PKT_BURST);
                stats[lcore_id].nb_rx_processed += nb_rx;

                if (unlikely(nb_rx == 0)) {
                    /* no packet received from rx queue, try to
                     * sleep for a while, forcing CPU to enter
                     * deeper C states */
                    rx_queue->zero_rx_packet_count++;
                    if (rx_queue->zero_rx_packet_count <=
                            MIN_ZERO_POLL_COUNT)
                        continue;
                    rx_queue->idle_hint =
9.1.1   Virtual Function Setup Instructions .......................................... 50
9.2     Compiling the Application ..................................................... 50
9.3     Running the Application ....................................................... 50
9.4     Explanation ................................................................... 51
        9.4.1   Command Line Arguments ................................................ 51
        9.4.2   Mbuf Pool Initialization .............................................. 51
        9.4.3   Driver Initialization ................................................. 52
        9.4.4   RX Queue Initialization ............................................... 53
        9.4.5   TX Queue Initialization ............................................... 54
        9.4.6   Receive, Process and Transmit Packets ................................. 54
10      L3 Forwarding Sample Application ............................................... 58
10.1    Overview ...................................................................... 58
10.2    Compiling the Application ..................................................... 58
10.3    Running the Application ....................................................... 59
10.4    Explanation ................................................................... 60
        10.4.1  Hash Initialization ................................................... 60
        10.4.2  LPM Initialization .................................................... 61
        10.4.3  Packet Forwarding for Hash-based Lookups .............................. 62
        10.4.4  Packet Forwarding for LPM-based Lookups ............................... 63
11      L3 Forwarding with Power Mana
d giving the secondary process access to the hardware and software rings for each network port. Similarly, the memory pool of mbufs is accessed by doing a lookup for it by name:

    mp = (proc_type == RTE_PROC_SECONDARY) ?
            rte_mempool_lookup(_SMP_MBUF_POOL) :
            rte_mempool_create(_SMP_MBUF_POOL, NB_MBUFS, MBUF_SIZE, /* ... */

Once this initialization is complete, the main loop of each process (both primary and secondary) is exactly the same: each process reads from each port using the queue corresponding to its proc-id parameter, and writes to the corresponding transmit queue on the output port.

16.1.4 Client-Server Multi-process Example

The third example multi-process application included with the Intel DPDK shows how one can use a client-server type multi-process design to do packet processing. In this example, a single server process performs the packet reception from the ports being used and distributes these packets using round-robin ordering among a set of client processes, which perform the actual packet processing. In this case, the client applications just perform level-2 forwarding of packets by sending each packet out on a different network port. The following diagram shows the data flow through the application, using two client processes.

Figure 8. Example data flow through the client-server application
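The server's round-robin distribution described above can be sketched independently of the DPDK ring APIs. This illustrative helper (the names are hypothetical, not from the sample code) picks the next client index for each received packet, cycling through the client set:

```c
/* Round-robin selection of the next client for each incoming packet. */
struct distributor {
    unsigned num_clients;  /* number of client processes */
    unsigned next;         /* index of the next client to serve */
};

unsigned
next_client(struct distributor *d)
{
    unsigned c = d->next;
    d->next = (d->next + 1) % d->num_clients;
    return c;
}
```

In the sample application, the returned index would select the software ring on which the server enqueues the packet for that client.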
d as other Intel DPDK libraries. See also the "rte_cmdline library should not be used in production code due to limited testing" item in the Known Issues section of the Release Notes.

The Command Line sample application supports some of the features of the GNU readline library, such as completion, cut/paste and some other special bindings that make configuration and debug faster and easier. The application shows how the rte_cmdline application can be extended to handle a list of objects. There are three simple commands:

• add obj_name IP: Add a new object with an IP/IPv6 address associated to it.
• del obj_name: Delete the specified object.
• show obj_name: Show the IP associated with the specified object.

Note: To terminate the application, use Ctrl-d.

2.2 Compiling the Application

1. Go to the example directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/cmdline

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   Refer to the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

       make

2.3 Running the Application

To run the application in a linuxapp environment, issue the followin
d on the global vlan_tags array, and dividing up the possible user priority values equally among the individual queues (also referred to as traffic classes) within each pool. That is, if the number of pools is 32, then the user priority fields are allocated two to a queue. If 16 pools are used, then each of the 8 user priority fields is allocated to its own queue within the pool. For the VLAN IDs, each one can be allocated to possibly multiple pools of queues, so the pools parameter in the rte_eth_vmdq_dcb_conf structure is specified as a bitmask value.

    const uint16_t vlan_tags[] = {
        0,  1,  2,  3,  4,  5,  6,  7,
        8,  9,  10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23,
        24, 25, 26, 27, 28, 29, 30, 31
    };

    /* Builds up the correct configuration for vmdq+dcb based on the vlan_tags
     * array given above, and the number of traffic classes available for use. */

    static inline int
    get_eth_conf(struct rte_eth_conf *eth_conf, enum rte_eth_nb_pools num_pools)
    {
        struct rte_eth_vmdq_dcb_conf conf;
        unsigned i;

        if (num_pools != ETH_16_POOLS && num_pools != ETH_32_POOLS)
            return -1;

        conf.nb_queue_pools = num_pools;
        conf.enable_default_pool = 0;
        conf.default_pool = 0; /* set explicit value, even if not used */
        conf.nb_pool_maps = sizeof(vlan_tags) / sizeof(vlan_tags[0]);

        for (i = 0; i < conf.nb_pool_maps; i++) {
            conf.pool_map[i].vlan_id = vlan_tags[i];
            conf.pool_map[i].pools = 1 << (i % num_pools);
        }
        for (i = 0;
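The pool_map loop above associates the i-th VLAN tag with exactly one pool by setting a single bit in a bitmask. A small stand-alone check of that arithmetic (plain C, no DPDK types; the helper name is invented for illustration):

```c
#include <stdint.h>

/* Bitmask selecting the pool for the i-th VLAN tag, as in the sample's
 * conf.pool_map[i].pools = 1 << (i % num_pools). */
uint64_t
pool_bitmask(unsigned i, unsigned num_pools)
{
    return (uint64_t)1 << (i % num_pools);
}
```

Because the index wraps modulo num_pools, a vlan_tags array longer than the pool count would map several VLAN IDs onto the same pool; with the 32-entry array and 32 pools, the mapping is one-to-one.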
dk
       cd ${RTE_SDK}/examples/quota_watermark

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

       make

20.3 Running the Application

The core application, qw, has to be started first. Once it is up and running, one can alter quota and watermarks while it runs, using the control application, qwctl.

20.3.1 Running the Core Application

The application requires a single command line option:

    ./qw/build/qw [EAL options] -- -p PORTMASK

where

    -p PORTMASK: A hexadecimal bitmask of the ports to configure

To run the application in a linuxapp environment with four logical cores and ports 0 and 2, issue the following command:

    ./qw/build/qw -c f -n 4 -- -p 5

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

20.3.2 Running the Control Application

The control application requires a number of command line options:

    ./qwctl/build/qwctl [EAL options] --proc-type=secondary

The --proc-type=secondary option is necessary for the EAL to properly initialize the control application to use the same huge pages as the
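The -p PORTMASK argument is a plain hexadecimal bitmask, so -p 5 (binary 101) selects ports 0 and 2 as in the example above. A sketch of the decoding (the helper name is hypothetical; real DPDK applications parse the mask in their own option-handling code):

```c
#include <stdlib.h>

/* Return non-zero if `port` is selected in the hex portmask string. */
int
port_enabled(const char *portmask, unsigned port)
{
    unsigned long mask = strtoul(portmask, NULL, 16);
    return (int)((mask >> port) & 1);
}
```

With this decoding, "5" enables ports 0 and 2 but not port 1, matching the four-core, two-port example command.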
dy timed out. If yes, then free all previously received fragments, and remove information about them from the entry.

3. If no entry with such key is found, then try to create a new one by one of two ways:

   a. Use as empty entry.
   b. Delete a timed-out entry, free mbufs associated with it, and store a new entry with the specified key in it.

4. Update the entry with new fragment information and check if a packet can be reassembled (the packet's entry contains all fragments):

   a. If yes, then reassemble the packet, mark the table's entry as empty and return the reassembled mbuf to the caller.
   b. If no, then just return a NULL to the caller.

If at any stage of packet processing a reassembly function encounters an error (can't insert a new entry into the Fragment table, or an invalid/timed-out fragment), then it will free all fragments associated with the packet, mark the table entry as invalid and return NULL to the caller.

7.4.4 Debug Logging and Statistics Collection

The RTE_LIBRTE_IP_FRAG_TBL_STAT macro controls statistics collection for the IP Fragment Table. This macro is disabled by default. To make ip_reassembly print the statistics to the standard output, the user must send either a USR1, INT or TERM signal to the process. For all of these signals, the ip_reassembly process prints Fragment table statistics for each RX queue; in addition, the INT and TERM signals will cause process termination as usual.
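The "entry contains all fragments" check in step 4 can be modeled with a per-entry bitmap. This is a simplified illustration only (the real fragment table tracks mbufs, offsets and total length rather than a fixed bitmap, and the names below are invented):

```c
#include <stdint.h>

struct frag_entry {
    uint32_t received;   /* bit i set when fragment i has arrived */
    uint32_t expected;   /* bits for all fragments of the packet  */
};

/* Record a fragment arrival; return 1 when the packet is complete. */
int
add_fragment(struct frag_entry *e, unsigned frag_idx)
{
    e->received |= (uint32_t)1 << frag_idx;
    return e->received == e->expected;
}
```

Once add_fragment() reports completion, the caller would reassemble the packet and mark the entry empty, mirroring step 4a above.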
e kni_port_params_array[] array, which is indexed by the port ID. The code is as follows:

    static int
    parse_config(const char *arg)
    {
        const char *p, *p0 = arg;
        char s[256], *end;
        unsigned size;
        enum fieldnames {
            FLD_PORT = 0,
            FLD_LCORE_RX,
            FLD_LCORE_TX,
            _NUM_FLD = KNI_MAX_KTHREAD + 3,
        };
        int i, j, nb_token;
        char *str_fld[_NUM_FLD];
        unsigned long int_fld[_NUM_FLD];
        uint8_t port_id, nb_kni_port_params = 0;

        memset(&kni_port_params_array, 0, sizeof(kni_port_params_array));

        while (((p = strchr(p0, '(')) != NULL) &&
                nb_kni_port_params < RTE_MAX_ETHPORTS) {
            p++;
            if ((p0 = strchr(p, ')')) == NULL)
                goto fail;

            size = p0 - p;
            if (size >= sizeof(s)) {
                printf("Invalid config parameters\n");
                goto fail;
            }

            rte_snprintf(s, sizeof(s), "%.*s", size, p);
            nb_token = rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',');
            if (nb_token <= FLD_LCORE_TX) {
                printf("Invalid config parameters\n");
                goto fail;
            }

            for (i = 0; i < nb_token; i++) {
                errno = 0;
                int_fld[i] = strtoul(str_fld[i], &end, 0);
                if (errno != 0 || end == str_fld[i]) {
                    printf("Invalid config parameters\n");
                    goto fail;
                }
            }

            i = 0;
            port_id = (uint8_t)int_fld[i];
            if (port_
e KNI, located in the kmod sub-directory of the Intel DPDK target directory, can be loaded with the kthread_mode parameter, as follows:

• insmod rte_kni.ko kthread_mode=single

  This mode will create only one kernel thread for all KNI devices for packet receiving in the kernel side. By default, it is in this single kernel thread mode. Core affinity for this kernel thread can be set by using the Linux command taskset.

• insmod rte_kni.ko kthread_mode=multiple

  This mode will create a kernel thread for each KNI device for packet receiving in the kernel side. The core affinity of each kernel thread is set when creating the KNI device. The lcore ID for each kernel thread is provided in the command line when launching the application. Multiple kernel thread mode can provide scalable, higher performance.

To measure the throughput in a loopback mode, the kernel module of the KNI, located in the kmod sub-directory of the Intel DPDK target directory, can be loaded with parameters as follows:

• insmod rte_kni.ko lo_mode=lo_mode_fifo

  This loopback mode will involve ring enqueue/dequeue operations in kernel space.

• insmod rte_kni.ko lo_mode=lo_mode_fifo_skb

  This loopback mode will involve ring enqueue/dequeue operations and sk buffer copies in kernel space.

8.4 Running the Application

The application requires a number of command line options:

    kni [EAL options] -- -P -p PORTMASK --config="(port,lcore_rx,lcore_tx[,lcore_kthread,...])[,(port,lcore_r
e TX queue per logical core on each port:

    fflush(stdout);
    ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
            rte_eth_dev_socket_id(portid), &tx_conf);
    if (ret < 0)
        rte_exit(EXIT_FAILURE,
                "rte_eth_tx_queue_setup: err=%d, port=%u\n",
                ret, (unsigned) portid);

The global configuration for TX queues is stored in a static structure:

    static const struct rte_eth_txconf tx_conf = {
        .tx_thresh = {
            .pthresh = TX_PTHRESH,
            .hthresh = TX_HTHRESH,
            .wthresh = TX_WTHRESH,
        },
        .tx_free_thresh = RTE_TEST_TX_DESC_DEFAULT + 1, /* disable feature */
    };

14.4.7 Receive, Process and Transmit Packets

In the lsi_main_loop() function, the main task is to read ingress packets from the RX queues. This is done using the following code:

    /* Read packet from RX queues */

    for (i = 0; i < qconf->n_rx_port; i++) {
        portid = qconf->rx_port_list[i];
        nb_rx = rte_eth_rx_burst((uint8_t) portid, 0,
                pkts_burst, MAX_PKT_BURST);
        port_statistics[portid].rx += nb_rx;

        for (j = 0; j < nb_rx; j++) {
            m = pkts_burst[j];
            rte_prefetch0(rte_pktmbuf_mtod(m, void *));
            lsi_simple_forward(m, portid);
        }
    }

Packets are read in a burst of size MAX_PKT_BURST. The rte_eth_rx_burst() function writes the mbuf pointers in a local table and returns the
e frame buffer in the vring, and can then lead to a different Ethernet RX descriptor number being used. So it is valid only when zero copy mode is enabled. The value is 32 by default.

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy 1 --rx-desc-num [0, n]

• TX descriptor number. The TX descriptor number option specifies the Ethernet TX descriptor number. It is valid only when zero copy mode is enabled. The value is 64 by default.

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy 1 --tx-desc-num [0, n]

Running the Virtual Machine (QEMU)

QEMU must be executed with specific parameters to:

• Ensure the guest is configured to use virtio-net network adapters:

    user@target:~$ qemu-system-x86_64 ... -device virtio-net-pci,netdev=hostnet1,id=net1 ...

• Ensure the guest's virtio-net network adapter is configured with offloads disabled:

    user@target:~$ qemu-system-x86_64 ... -device virtio-net-pci,netdev=hostnet1,id=net1,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off ...

• Redirect QEMU to communicate with the Intel DPDK vhost-net sample code in place of the vhost-net kernel module:

    user@target:~$ qemu-system-x86_64 ... -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd> ...

• En
e functions. If a separate TX core is used, these are sent to the TX ring. Otherwise, they are sent directly to the TX port. The TX thread, if present, reads from the TX ring and writes the packets to the TX port.

18.2 Compiling the Application

To compile the application:

1. Go to the sample application directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/qos_sched

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   Note: This application is intended as a linuxapp only.

3. Build the application:

       make

Note: To get statistics on the sample app using the command line interface as described in the next section, DPDK must be compiled defining CONFIG_RTE_SCHED_COLLECT_STATS, which can be done by changing the configuration file for the specific target to be compiled.

18.3 Running the Application

Note: In order to run the application, a total of at least 4 GB of huge pages must be set up for each of the used sockets (depending on the cores in use).

The application has a number of command line options:

    qos_sched [EAL options] -- <APP PARAMS>

Mandatory application parameters include:

• --pfc "RX PORT, TX PORT, RX LCORE, WT LCORE, TX
e servicing of virtio-net devices to the vhost-net kernel module, reducing the context switching and packet copies in the virtual dataplane.

This is achieved by QEMU sharing the following information with the vhost-net module through the vhost-net API:

• The layout of the guest memory space, to enable the vhost-net module to translate addresses.

• The locations of virtual queues in QEMU virtual address space, to enable the vhost module to read/write directly to and from the virtqueues.

• An event file descriptor (eventfd) configured in KVM to send interrupts to the virtio-net device driver in the guest. This enables the vhost-net module to notify (call) the guest.

• An eventfd configured in KVM to be triggered on writes to the virtio-net device's Peripheral Component Interconnect (PCI) config space. This enables the vhost-net module to receive notifications (kicks) from the guest.

The following figure shows the system architecture for virtio-net networking with vhost-net offload.

Figure 18. Virtio with Linux Kernel Vhost

23.2 Sample Code Overview

The Intel DPDK vhost-net sample code demonstrates KVM (QEMU) offloading the servicing of a Virtual Machine's (VM's) virtio-net devices to an Intel DPDK-based a
e threads/processes with the same core ID try to use the local cache. Therefore, unused core IDs from the passing of parameters with the -c option are used to organize the core ID allocation array. Once the floating process is spawned, it tries to allocate a unique core ID from the array and releases it on exit.

A natural way to spawn a floating process is to use the fork() function and allocate a unique core ID from the unused core ID array. However, it is necessary to write new code to provide a notification mechanism for slave exit, and to make sure the process recovery mechanism can work with it.

To avoid producing redundant code, the Master-Slave process model is still used to spawn floating processes, then cancel the affinity to specific cores. Besides that, clear the core ID assigned to the Intel DPDK when spawning a thread that has a 1:1 mapping with the core mask. Thereafter, get a new core ID from the unused core ID allocation array.

Run the Application

This example has a command line similar to the L2 Forwarding sample application, with a few differences. To run the application, start one copy of the l2fwd_fork binary in one terminal. Unlike the L2 Forwarding example, this example requires at least three cores, since the master process will wait and be accountable for slave process recovery. The command is as follows:

    ./build/l2fwd_fork -c 1c -n 4 -- -p 3 -f

This example provides another option to specify the use of floating proces
ed on the actual hardware Rx queue size, which is configured via the rte_eth_rx_queue_setup() function.

Note: In general, a thread needs to poll packets from multiple Rx queues. Most likely, different queues have different loads, so they would return different frequency hints. The algorithm evaluates all the hints and then scales up frequency in an aggressive manner, by scaling up to the highest frequency as long as one Rx queue requires it. In this way, we can minimize any negative performance impact. On the other hand, frequency scaling down is controlled in the timer callback function. Specifically, if the sleep times of a logical core indicate that it is sleeping more than 25% of the sampling period, or if the average packets per iteration is less than expectation, the frequency is decreased by one step.

11.5.4 C-State Heuristic Algorithm

Whenever recent rte_eth_rx_burst() polls return 5 consecutive zero packets, an idle counter begins incrementing for each successive zero poll. At the same time, the function power_idle_heuristic() is called to generate a speculative sleep duration, in order to force the logical core to enter a deeper sleeping C-state. There is no way to control C-state directly, and the CPUIdle subsystem in the OS is intelligent enough to select the C-state to enter based on the actual sleep period of the given logical core.

The algorithm has the following sleeping behavior, depending on the idle counter:

• If the idle count is less than 100, the counter value
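The idle-counter heuristic above can be sketched as a simple mapping from the zero-poll streak to a speculative sleep hint. The thresholds and sleep durations below are invented for illustration (the source only states that the streak starts mattering at 100); only the shape of the heuristic, longer streaks yielding longer sleeps, follows the text:

```c
/* Illustrative sleep-hint heuristic: longer zero-poll streaks suggest
 * longer speculative sleeps. Thresholds are assumptions, not the
 * sample application's actual values. */
unsigned
idle_sleep_hint_us(unsigned zero_poll_count)
{
    if (zero_poll_count < 100)
        return 0;      /* keep polling at full speed */
    if (zero_poll_count < 1000)
        return 10;     /* short nap; shallow C-state */
    return 100;        /* long nap; OS may pick a deeper C-state */
}
```

As the text notes, the OS CPUIdle subsystem ultimately chooses the C-state; the application only controls how long it stops polling.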
eeds to be built, the final make command can be run just in that application's directory, rather than at the top-level multi-process directory.

16.1.2 Basic Multi-process Example

The examples/simple_mp folder in the Intel DPDK release contains a basic example application to demonstrate how two Intel DPDK processes can work together using queues and memory pools to share information.

16.1.2.1 Running the Application

To run the application, start one copy of the simple_mp binary in one terminal, passing at least two cores in the coremask, as follows:

    ./build/simple_mp -c 3 -n 4 --proc-type=primary

For the first Intel DPDK process run, the proc-type flag can be omitted or set to auto, since all Intel DPDK processes will default to being a primary instance, meaning they have control over the hugepage shared memory regions. The process should start successfully and display a command prompt as follows:

    $ ./build/simple_mp -c 3 -n 4 --proc-type=primary
    EAL: coremask set to 3
    EAL: Detected lcore 0 on socket 0
    EAL: Detected lcore 1 on socket 0
    EAL: Detected lcore 2 on socket 0
    EAL: Detected lcore 3 on socket 0
    EAL: Requesting 2 pages of size 1073741824
    EAL: Requesting 768 pages of size 2097152
    EAL: Ask a virtual are
er.

2. Set the target (a default target is used if not specified):

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   Note: This application is intended as a linuxapp only.

3. Build the application:

       make

17.3 Running the Application

The application execution command line is as below:

    ./qos_meter [EAL options] -- -p PORTMASK

The application is constrained to use a single core in the EAL core mask and 2 ports only in the application port mask (the first port from the port mask is used for RX and the other port in the port mask is used for TX).

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

17.4 Explanation

Selecting one of the metering modes is done with these defines:

    #define APP_MODE_FWD                0
    #define APP_MODE_SRTCM_COLOR_BLIND  1
    #define APP_MODE_SRTCM_COLOR_AWARE  2
    #define APP_MODE_TRTCM_COLOR_BLIND  3
    #define APP_MODE_TRTCM_COLOR_AWARE  4

    #define APP_MODE APP_MODE_SRTCM_COLOR_BLIND

To simplify debugging (for example, by using the traffic generator RX side MAC address based packet filtering feature), the color is defined as the LSB byte of the destination MAC address.

The traffic meter parameters are configured in t
er ............................................................................... 34
7.4     Explanation ................................................................... 35
        7.4.1   IPv4 Fragment Table Initialization .................................... 35
        7.4.2   Mempools Initialization ............................................... 36
        7.4.3   Packet Reassembly and Forwarding ...................................... 36
        7.4.4   Debug Logging and Statistics Collection ............................... 37
8       Kernel NIC Interface Sample Application ....................................... 38
8.1     Overview ...................................................................... 38
8.2     Compiling the Application ..................................................... 39
8.3     Loading the Kernel Module ..................................................... 39
8.4     Running the Application ....................................................... 40
8.5     KNI Operations ................................................................ 41
8.6     Explanation ................................................................... 41
        8.6.1   Initialization ........................................................ 42
        8.6.2   Packet Forwarding ..................................................... 45
        8.6.3   Callbacks for Kernel Requests ......................................... 46
9       L2 Forwarding Sample Application (in Real and Virtualized Environments) ....... 48
9.1     Overview ...................................................................... 48
9.1.1   Virtual Fun
es:

• Table type (e.g., hash-spec-16-ext or hash-spec-16-lru). The available options are: ext (extendible bucket) or lru (least recently used).

Table 3. Table Types

1. none
   Description of Core B table: Core B is not implementing an Intel DPDK pipeline. Core B is implementing a pass-through from its input set of software queues to its output set of software queues.
   Pre-added table entries: N/A

2. stub
   Description of Core B table: Stub table. Core B is implementing the same pass-through functionality as described for the "none" option, by using the Intel DPDK Packet Framework, with one stub table for each input NIC port.
   Pre-added table entries: N/A

3. hash-spec-8-lru
   Description of Core B table: LRU hash table with 8-byte key size and 16 million entries.

4. hash-spec-8-ext
   Description of Core B table: Extendible bucket hash table with 8-byte key size and 16 million entries.

   Pre-added table entries (for the hash table types): 16 million entries are successfully added to the hash table with the following key format: [4-byte index, 4 bytes of 0]. The action configured for all table entries is "Send to output port", with the output port index uniformly distributed for the range of output ports. The default table rule (used in the case of a lookup miss) is to drop the packet.
et_id(), 0);

    clone_pool = rte_mempool_create("clone_pool", NB_CLONE_MBUF,
            CLONE_MBUF_SIZE, 32, 0, NULL, NULL, rte_pktmbuf_init, NULL,
            rte_socket_id(), 0);

The reason for this is that indirect buffers are not supposed to hold any packet data and, therefore, can be initialized with a lower amount of reserved memory for each buffer.

6.4.2 Hash Initialization

The hash object is created and loaded with the pre-configured entries read from a global array:

    static int
    init_mcast_hash(void)
    {
        uint32_t i;

        mcast_hash_params.socket_id = rte_socket_id();
        mcast_hash = rte_fbk_hash_create(&mcast_hash_params);
        if (mcast_hash == NULL)
            return -1;

        for (i = 0; i < N_MCAST_GROUPS; i++) {
            if (rte_fbk_hash_add_key(mcast_hash,
                    mcast_group_table[i].ip,
                    mcast_group_table[i].port_mask) < 0)
                return -1;
        }
        return 0;
    }

6.4.3 Forwarding

All forwarding is done inside the mcast_forward() function. Firstly, the Ethernet header is removed from the packet and the IPv4 address is extracted from the IPv4 header:

    /* Remove the Ethernet header from the input packet */

    iphdr = (struct ipv4_hdr *) rte_pktmbuf_adj(m, sizeof(struct ether_hdr));
    RTE_MBUF_ASSERT(iphdr != NULL);
    dest_addr = rte_be_to_cpu_32(iphdr->dst_addr);

Then, the packe
ff,guest_tso6=off,guest_ecn=off -hda <disk img> -m 4096 -mem-path /dev/hugepages -mem-prealloc

Libvirt Integration

The QEMU wrapper script, qemu-wrap.py, wraps libvirt calls to QEMU, such that QEMU is called with the correct parameters described above. To call the QEMU wrapper automatically from libvirt, the following configuration changes must be made:

• Place the QEMU wrapper script in libvirt's binary search PATH ($PATH). A good location is in the directory that contains the QEMU binary. Ensure that the script has the same owner/group and file permissions as the QEMU binary.

• Update the VM xml file using virsh edit <vm name>:

  - Set the VM to use the launch script. Set the emulator path contained in the <emulator>...</emulator> tags. For example, replace:

        <emulator>/usr/bin/qemu-kvm</emulator>

    with:

        <emulator>/usr/bin/qemu-wrap.py</emulator>

  - Set the VM's virtio-net device(s) to use vhost-net offload:

        <interface type="network">
            <model type="virtio"/>
            <driver name="vhost"/>
        </interface>

• Enable libvirt to access the Intel DPDK Vhost sample code's character device file by adding it to the controllers cgroup for libvirtd, using the following steps:

        cgroup_controllers = [ ... "devices", ... ]
        clear_emulator_capabilities = 0
        user = "root"
        group = "root"
        cgroup_device_acl = [
            "/dev/null", "/dev/full", "/dev/zero",
            "/dev/random", "/dev/urandom",
fic loss rate, delay and jitter requirements, such as voice, video or data transfers.

• A queue hosts packets from one or multiple connections of the same type, belonging to the same user.

The traffic flows that need to be configured are application dependent. This application classifies based on the QinQ double VLAN tags and the IP destination address, as indicated in the following table.

Table 2. Entity Types

Level Name     | Siblings per Parent | QoS Functional Description                          | Selected By
Port           | -                   | Ethernet port                                       | Physical port
Subport        | Config (8)          | Traffic shaped (token bucket)                       | Outer VLAN tag
Pipe           | Config (4k)         | Traffic shaped (token bucket)                       | Inner VLAN tag
Traffic Class  | 4                   | TCs of the same pipe serviced in strict priority    | Destination IP address (0.0.X.0)
Queue          | 4                   | Queues of the same TC serviced in WRR               | Destination IP address (0.0.0.X)

Please refer to the QoS Scheduler chapter in the Intel DPDK Programmer's Guide for more information about these parameters.
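The classification in Table 2 can be sketched as plain bit-field extraction: the outer VLAN tag selects the subport, the inner VLAN tag selects the pipe, and two bytes of the destination IP select the traffic class and queue. The helper below is illustrative only (the field widths are taken from the table's sibling counts; the names and masking are assumptions, not the sample's code):

```c
#include <stdint.h>

struct flow_class {
    uint16_t subport;  /* from outer VLAN tag (8 subports)   */
    uint16_t pipe;     /* from inner VLAN tag (4k pipes)     */
    uint8_t  tc;       /* from dst IP byte 0.0.X.0 (4 TCs)   */
    uint8_t  queue;    /* from dst IP byte 0.0.0.X (4 queues)*/
};

struct flow_class
classify(uint16_t outer_vlan, uint16_t inner_vlan, uint32_t dst_ip)
{
    struct flow_class fc;
    fc.subport = outer_vlan & 0x7;     /* 8 subports per port   */
    fc.pipe    = inner_vlan & 0xfff;   /* 4k pipes per subport  */
    fc.tc      = (dst_ip >> 8) & 0x3;  /* 4 TCs per pipe        */
    fc.queue   = dst_ip & 0x3;         /* 4 queues per TC       */
    return fc;
}
```

The four extracted values together identify exactly one queue in the scheduler hierarchy, which is what the enqueue side of the QoS scheduler needs.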
form ACL classification straight over the incoming packet buffer; no extra protocol field retrieval needs to be performed.

Subsequently, the application checks whether NUMA is enabled. If it is, the application records the socket IDs of the CPU cores involved in the task.

Finally, the application creates context handlers from the ACL library, adds rules parsed from the file into the database and builds an ACL trie. It is important to note that the application creates an independent copy of each database for each socket CPU involved in the task, to reduce the time for remote memory access.

13 L3 Forwarding in a Virtualization Environment Sample Application

The L3 Forwarding in a Virtualization Environment sample application is a simple example of packet processing using the Intel DPDK. The application performs L3 forwarding that takes advantage of Single Root I/O Virtualization (SR-IOV) features in a virtualized environment.

13.1 Overview

The application demonstrates the use of the hash and LPM libraries in the Intel DPDK to implement packet forwarding. The initialization and run-time paths are very similar to those of the L3 Forwarding application (see Chapter 10, L3 Forwarding Sample Application, for more information). The forwarding decision is taken based
...forwarded to the correct port.

Overview

The application demonstrates the use of the ACL library in the Intel DPDK to implement access control and packet L3 forwarding. The application loads two types of rules at initialization:

• Route information rules, which are used for L3 forwarding.
• Access Control List (ACL) rules that blacklist (or block) packets with a specific characteristic.

When packets are received from a port, the application extracts the necessary information from the TCP/IP header of the received packet and performs a lookup in the rule database to figure out whether the packets should be dropped (in the ACL range) or forwarded to desired ports. The initialization and run-time paths are similar to those of the L3 Forwarding sample application (see Chapter 10, L3 Forwarding Sample Application, for more information). However, there are significant differences in the two applications. For example, the original L3 forwarding application uses either LPM or an exact match algorithm to perform forwarding port lookup, while this application uses the ACL library to perform both ACL and route entry lookup. The following sections provide more detail.

Classification for both IPv4 and IPv6 packets is supported in this application. The application also assumes that all the packets it processes are TCP/UDP packets and always extracts source/destination port information from the packets.

...Tuple Packet Syntax

The application implements packet
...from both NUMA nodes if the platform has only two ports and the programmer wants to use one core from each processor socket to do the packet processing, since only one Rx/Tx queue pair can be used in virtualization mode.

To enable L3 forwarding between two ports, using one core from each processor, while also taking advantage of local memory accesses by optimizing around NUMA, the programmer can pin to the appropriate cores and allocate memory from the appropriate NUMA node. This is achieved using the following command:

    ./build/l3fwd-vf -c 0x03 -n 3 -- -p 0x3 --config="(0,0,0),(1,0,1)"

In this command:

• The -c option enables cores 0 and 1.
• The -p option enables ports 0 and 1.
• The --config option enables one queue on each port and maps each (port, queue) pair to a specific core.

Logic to enable multiple RX queues using RSS and to allocate memory from the correct NUMA nodes is included in the application and is done transparently. The following table shows the mapping in this example:

    Port  Queue  Lcore  Description
    0     0      0      Map queue 0 from port 0 to lcore 0.
    1     0      1      Map queue 0 from port 1 to lcore 1.

Refer to the Intel DPDK Getting Started Guide for
...fs not transmitted to the KNI interface:

        kni_burst_free_mbufs(&pkts_burst[num], nb_rx - num);
        kni_stats[port_id].rx_dropped += nb_rx - num;

For the other case, that reads from kernel NIC interfaces and writes to a physical NIC port, packets are retrieved by reading mbufs from kernel NIC interfaces by using rte_kni_rx_burst(). The packet transmission is the same as in the L2 Forwarding sample application (see Section 9.4.6, Receive, Process and Transmit Packets, on page 54).

    /* Interface to dequeue mbufs from tx_q and burst tx */
    static void
    kni_egress(struct kni_port_params *p)
    {
        uint8_t i, nb_kni, port_id;
        unsigned nb_tx, num;
        struct rte_mbuf *pkts_burst[PKT_BURST_SZ];

        if (p == NULL)
            return;

        nb_kni = p->nb_kni;
        port_id = p->port_id;
        for (i = 0; i < nb_kni; i++) {
            /* Burst rx from kni */
            num = rte_kni_rx_burst(p->kni[i], pkts_burst, PKT_BURST_SZ);
            if (unlikely(num > PKT_BURST_SZ)) {
                RTE_LOG(ERR, APP, "Error receiving from KNI\n");
                return;
            }
            /* Burst tx to eth */
            nb_tx = rte_eth_tx_burst(port_id, 0, pkts_burst, (uint16_t)num);
            kni_stats[port_id].tx_packets += nb_tx;
            if (unlikely(nb_tx < num)) {
                /* Free mbufs not tx to NIC */
                kni_burst_free_mbufs(&pkts_burst[nb_tx], num - nb_tx);
                kni_stats[port_id].tx_dropped += num - nb_tx;
...g command:

    ./build/cmdline -c f -n 4

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

Explanation

The following sections provide some explanation of the code.

EAL Initialization and cmdline Start

The first task is the initialization of the Environment Abstraction Layer (EAL). This is achieved as follows:

    int MAIN(int argc, char **argv)
    {
        ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_panic("Cannot init EAL\n");

Then, a new command line object is created and started to interact with the user through the console:

        cl = cmdline_stdin_new(main_ctx, "example> ");
        cmdline_interact(cl);
        cmdline_stdin_exit(cl);

The cmdline_interact() function returns when the user types Ctrl-d and, in this case, the application exits.

Defining a cmdline Context

A cmdline context is a list of commands that are listed in a NULL-terminated table, for example:

    cmdline_parse_ctx_t main_ctx[] = {
        (cmdline_parse_inst_t *)&cmd_obj_del_show,
        (cmdline_parse_inst_t *)&cmd_obj_add,
        (cmdline_parse_inst_t *)&cmd_help,
        NULL,
    };

Each command (of type cmdline_parse_inst_t) is defined statically. It contains a pointer to a callback function that is executed when
...gement Sample Application ............................................. 64
    11.1 Introduction .................................................... 64
    11.2 Overview ........................................................ 64
    11.3 Compiling the Application ....................................... 65
    11.4 Explanation ..................................................... 66
        11.4.1 Power Library Initialization .............................. 66
        11.4.2 Monitoring Loads of Rx Queues ............................. 67
        11.4.3 P-State Heuristic Algorithm ............................... 69
        11.4.4 C-State Heuristic Algorithm ............................... 70
L3 Forwarding in a Virtualization Environment Sample Application ......... 71
    12.1 Overview ........................................................ 78
    12.2 Compiling the Application ....................................... 79
    12.3 Running the Application ......................................... 79
    12.4 Explanation ..................................................... 80
Link Status Interrupt Sample Application ................................. 81
    13.1 Overview ........................................................ 81
    13.2 Compiling the Application ....................................... 81
    13.3 Running the Application ......................................... 82
    13.4 Explanation ..................................................... 82
        13.4.1
...general information on running applications and the Environment Abstraction Layer (EAL) options.

Explanation

The operation of this application is similar to that of the basic L3 Forwarding sample application. See Section 10.4, Explanation, on page 60 for more information.

14 Link Status Interrupt Sample Application

The Link Status Interrupt sample application is a simple example of packet processing using the Intel Data Plane Development Kit (Intel DPDK) that demonstrates how network link status changes for a network port can be captured and used by an Intel DPDK application.

14.1 Overview

The Link Status Interrupt sample application registers a user space callback for the link status interrupt of each port and performs L2 forwarding for each packet that is received on an RX_PORT. The following operations are performed:

• RX_PORT and TX_PORT are paired with available ports one by one according to the core mask.
• The source MAC address is replaced by the TX_PORT MAC address.
• The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID.

This application can be used to demonstrate the usage of link status interrupt and its user space callbacks, and the behavior of L2 forwarding each time the link status changes.

14.2 Compiling the Application

1. Go to the example directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/e
...guments is done in the l2fwd_parse_args() function. The method of argument parsing is not described here. Refer to the glibc getopt(3) man page for details. EAL arguments are parsed first, then application-specific arguments. This is done at the beginning of the main() function:

    /* init EAL */
    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
    argc -= ret;
    argv += ret;

    /* parse application arguments (after the EAL ones) */
    ret = l2fwd_parse_args(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");

9.4.2 Mbuf Pool Initialization

Once the arguments are parsed, the mbuf pool is created. The mbuf pool contains a set of mbuf objects that will be used by the driver and the application to store network packet data:

    /* create the mbuf pool */
    l2fwd_pktmbuf_pool = rte_mempool_create("mbuf_pool", NB_MBUF, MBUF_SIZE, 32,
        sizeof(struct rte_pktmbuf_pool_private),
        rte_pktmbuf_pool_init, NULL,
        rte_pktmbuf_init, NULL,
        SOCKET0, 0);
    if (l2fwd_pktmbuf_pool == NULL)
        rte_panic("Cannot init mbuf pool\n");

The rte_mempool is a generic structure used to handle pools of objects. In this case, it is necessary to create a pool that will be used by the driver, which expects to have some reserved space in the mempool structure, sizeof(struct rte_pktmbuf_pool_private) bytes. The number of allocated pktmbufs is NB_MBUF, with a size of MBUF_SIZE eac
...gure() function is used to configure the number of queues for a port:

    ret = rte_eth_dev_configure((uint8_t)portid, 1, 1, &port_conf);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%u\n",
            ret, portid);

The global configuration is stored in a static structure:

    static const struct rte_eth_conf port_conf = {
        .rxmode = {
            .split_hdr_size = 0,
            .header_split = 0,   /**< Header Split disabled */
            .hw_ip_checksum = 0, /**< IP checksum offload disabled */
            .hw_vlan_filter = 0, /**< VLAN filtering disabled */
            .jumbo_frame = 0,    /**< Jumbo Frame Support disabled */
            .hw_strip_crc = 0,   /**< CRC stripped by hardware */
        },
        .txmode = {
            .mq_mode = ETH_DCB_NONE,
        },
    };

RX Queue Initialization

The application uses one lcore to poll one or several ports, depending on the -q option, which specifies the number of queues per lcore. For example, if the user specifies -q 4, the application is able to poll four ports with one lcore. If there are 16 ports on the target (and if the portmask argument is -p ffff), the application will need four lcores to poll all the ports.

    ret = rte_eth_rx_queue_setup((uint8_t)portid, 0, nb_rxd, SOCKET0,
        &rx_conf, l2fwd_pktmbuf_pool);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: err=%d, port=%u\n",
            ret, portid);

The list of queues that must be polled for a given lcore is stored in a private structure
...h. A per-lcore cache of 32 mbufs is kept. The memory is allocated in NUMA socket 0, but it is possible to extend this code to allocate one mbuf pool per socket.

Two callback pointers are also given to the rte_mempool_create() function:

• The first callback pointer is to rte_pktmbuf_pool_init() and is used to initialize the private data of the mempool, which is needed by the driver. This function is provided by the mbuf API, but can be copied and extended by the developer.

• The second callback pointer given to rte_mempool_create() is the mbuf initializer. The default is used, that is, rte_pktmbuf_init(), which is provided in the rte_mbuf library. If a more complex application wants to extend the rte_pktmbuf structure for its own needs, a new function derived from rte_pktmbuf_init() can be created.

9.4.3 Driver Initialization

The main part of the code in the main() function relates to the initialization of the driver. To fully understand this code, it is recommended to study the chapters related to the Poll Mode Driver in the Intel DPDK Programmer's Guide and the Intel DPDK API Reference.

    if (rte_eal_pci_probe() < 0)
        rte_exit(EXIT_FAILURE, "Cannot probe PCI\n");

    nb_ports = rte_eth_dev_count();
    if
...h 1, in value with the number expected. For example, a range of 6 to 7 is enumerated as 0b110 and 0b111. Bits 1-7 are bits that never change and bit 0 is the bit that is dynamically changed. Therefore, set bit 0 in the mask and value with 0; set bits 1-7 in the mask with 1, and bits 1-7 in the value with the number 0b11. So, the mask is 0xfe and the value is 0x6.

Note: The library assumes that each field in the rule is in LSB (Little Endian) order when creating the database. It internally converts them to MSB (Big Endian) order. When performing a lookup, the library assumes the input is in MSB (Big Endian) order.

12.1.2 Access Rule Syntax

In this sample application, each rule is a combination of the following:

• 5-tuple field: This field has a format described in Section 12.1.1.

• priority field: A weight to measure the priority of the rules. The rule with the higher priority will ALWAYS be returned if the specific input has multiple matches in the rule database. Rules with lower priority will NEVER be returned in any case.

• userdata field: A user-defined field that could be any value. It can be the forwarding port number if the rule is a route table entry, or it can be a pointer to a mapping address if the rule is used for address mapping in the NAT application. The key point is that it is a useful reserved field for user convenience.

12.1.3 ACL and Route Rules

The application needs to acquire ACL and route rules before it runs. Route rules are mandatory, while ACL rules are opti
...he -q option, which specifies the number of queues per lcore. For example, if the user specifies -q 4, the application is able to poll four ports with one lcore. If there are 16 ports on the target (and if the portmask argument is -p ffff), the application will need four lcores to poll all the ports.

    ret = rte_eth_rx_queue_setup((uint8_t)portid, 0, nb_rxd, SOCKET0,
        &rx_conf, lsi_pktmbuf_pool);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: err=%d, port=%u\n",
            ret, portid);

The list of queues that must be polled for a given lcore is stored in a private structure called struct lcore_queue_conf:

    struct lcore_queue_conf {
        unsigned n_rx_port;
        unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
        unsigned tx_queue_id;
        struct mbuf_table tx_mbufs[LSI_MAX_PORTS];
    } __rte_cache_aligned;
    struct lcore_queue_conf lcore_queue_conf[RTE_MAX_LCORE];

The n_rx_port and rx_port_list[] fields are used in the main packet processing loop (see Section 14.4.7, Receive, Process and Transmit Packets, on page 86, later in this chapter).

The global configuration for the RX queues is stored in a static structure:

    static const struct rte_eth_rxconf rx_conf = {
        .rx_thresh = {
            .pthresh = RX_PTHRESH,
            .hthresh = RX_HTHRESH,
            .wthresh = RX_WTHRESH,
        },
    };

14.4.6 TX Queue Initialization

Each lcore should be able to transmit on any port. For every port, a single TX queue is initialized.

    /* init on
...he application source code with the following default values:

    struct rte_meter_srtcm_params app_srtcm_params = {
        .cir = 1000000 * 46,
        .cbs = 2048,
        .ebs = 2048,
    };

    struct rte_meter_trtcm_params app_trtcm_params = {
        .cir = 1000000 * 46,
        .pir = 1500000 * 46,
        .cbs = 2048,
        .pbs = 2048,
    };

Assuming the input traffic is generated at line rate and all packets are 64-byte Ethernet frames (IPv4 packet size of 46 bytes) and green, the expected output traffic should be marked as shown in the following table.

Table 1. Output Traffic Marking

    Mode         Green (Mpps)   Yellow (Mpps)   Red (Mpps)
    srTCM blind  1              1               12.88
    srTCM color  1              1               12.88
    trTCM blind  1              0.5             13.38
    trTCM color  1              0.5             13.38
    FWD          14.88          0               0

To set up the policing scheme as desired, it is necessary to modify the main.h source file, where this policy is implemented as a static structure, as follows:

    int policer_table[e_RTE_METER_COLORS][e_RTE_METER_COLORS] =
    {
        { GREEN, RED, RED },
        { DROP, YELLOW, RED },
        { DROP, DROP, RED },
    };

Where rows indicate the input color, columns indicate the output color, and the value that is stored in the table indicates the action to be taken for that particular case. There are four different actions:
...ication. Then, packets are dispatched to the destination ports according to the portmask associated with a multicast group:

    for (port = 0; use_clone != port_mask; port_mask >>= 1, port++) {
        /* Prepare output packet and send it out. */
        if ((port_mask & 1) != 0) {
            if (likely((mc = mcast_out_pkt(m, use_clone)) != NULL))
                mcast_send_pkt(mc, &dst_eth_addr.as_addr, qconf, port);
        } else if (use_clone == 0)
            rte_pktmbuf_free(m);
    }

The actual packet transmission is done in the mcast_send_pkt() function:

    static inline void
    mcast_send_pkt(struct rte_mbuf *pkt, struct ether_addr *dest_addr,
        struct lcore_queue_conf *qconf, uint8_t port)
    {
        struct ether_hdr *ethdr;
        uint16_t len;

        /* Construct Ethernet header. */
        ethdr = (struct ether_hdr *)rte_pktmbuf_prepend(pkt,
            (uint16_t)sizeof(*ethdr));
        RTE_MBUF_ASSERT(ethdr != NULL);

        ether_addr_copy(dest_addr, &ethdr->d_addr);
        ether_addr_copy(&ports_eth_addr[port], &ethdr->s_addr);
        ethdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);

        /* Put new packet into the output queue */
        len = qconf->tx_mbufs[port].len;
        qconf->tx_mbufs[port].m_table[len] = pkt;
        qconf->tx_mbufs[port].len = ++len;

        /* Transmit packets */
        if (unlikely(MAX_PKT_BURST == len))
            send_burst(qconf, port);
    }

Buffer Cloning

This is the most important part of the application, since it demonstrates the use of zero-copy buffer cloning. There are two approaches for creating the
...id >= RTE_MAX_ETHPORTS) {
        printf("Port ID %u could not exceed the maximum %u\n",
            port_id, RTE_MAX_ETHPORTS);
        goto fail;
    }
    if (kni_port_params_array[port_id]) {
        printf("Port %u has been configured\n", port_id);
        goto fail;
    }
    kni_port_params_array[port_id] = (struct kni_port_params *)rte_zmalloc(
        "KNI_port_params", sizeof(struct kni_port_params), CACHE_LINE_SIZE);
    kni_port_params_array[port_id]->port_id = port_id;
    kni_port_params_array[port_id]->lcore_rx = (uint8_t)int_fld[i++];
    kni_port_params_array[port_id]->lcore_tx = (uint8_t)int_fld[i++];
    if (kni_port_params_array[port_id]->lcore_rx >= RTE_MAX_LCORE ||
        kni_port_params_array[port_id]->lcore_tx >= RTE_MAX_LCORE) {
        printf("lcore_rx %u or lcore_tx %u ID could not "
            "exceed the maximum %u\n",
            kni_port_params_array[port_id]->lcore_rx,
            kni_port_params_array[port_id]->lcore_tx, RTE_MAX_LCORE);
        goto fail;
    }
    for (j = 0; i < nb_token && j < KNI_MAX_KTHREAD; i++, j++)
        kni_port_params_array[port_id]->lcore_k[j] = (uint8_t)int_fld[i];
    kni_port_params_array[port_id]->nb_lcore_k = j;

    print_config();
    return 0;

fail:
    for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
        if (kni_port_params_array[i]) {
            rte_free(kni_port_params_array[i]);
            kni_port_params_array[i] = NULL;
        }
    }
...idle_hint < lcore_idle_hint)
        lcore_idle_hint = rx_queue->idle_hint;

    if (lcore_idle_hint < SLEEP_GEAR1_THRESHOLD)
        /**
         * execute "pause" instruction to avoid context
         * switch for short sleep
         */
        rte_delay_us(lcore_idle_hint);
    else
        /* long sleep forces the running thread to suspend */
        usleep(lcore_idle_hint);

    stats[lcore_id].sleep_time += lcore_idle_hint;

11.5.3 P-State Heuristic Algorithm

The power_freq_scaleup_heuristic() function is responsible for generating a frequency hint for the specified logical core, according to the available descriptor number returned from rte_eth_rx_queue_count(). On every poll for new packets, the length of available descriptors on an Rx queue is evaluated, and the algorithm used for frequency hinting is as follows:

• If the size of available descriptors exceeds 96, the maximum frequency is hinted.
• If the size of available descriptors exceeds 64, a trend counter is incremented by 100.
• If the length of the ring exceeds 32, the trend counter is incremented by 1.
• When the trend counter reaches 10000, the frequency hint is changed to the next higher frequency.

Note: The assumption is that the Rx queue size is 128 and the thresholds specified above must be adjusted accordingly bas
...ination = 0.128.0.0/9, L4 protocol = ANY, TCP source port = ANY, TCP destination port = ANY => send to output port 1.

The default table rule (used in the case of a lookup miss) is to drop the packet.

26.3.3 Input Traffic

Regardless of the table type used for the core B pipeline, the same input traffic can be used to hit all table entries with uniform distribution, which results in uniform distribution of packets sent out on the set of output NIC ports. The profile for input traffic is TCP/IPv4 packets with:

• destination IP address as A.B.C.D, with A fixed to 0 and B, C, D random
• source IP address fixed to 0.0.0.0
• destination TCP port fixed to 0
• source TCP port fixed to 0
...ion specifies 3 memory channels.
• The -p option enables ports 0 and 1.
• The -q option assigns 1 queue to each lcore.

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

6.4 Explanation

The following sections provide some explanation of the code. As mentioned in the overview section, the initialization and run-time paths are very similar to those of the L2 Forwarding sample application (see Chapter 9, L2 Forwarding Sample Application in Real and Virtualized Environments, for more information). The following sections describe aspects that are specific to the IPv4 Multicast sample application.

6.4.2 Memory Pool Initialization

The IPv4 Multicast sample application uses three memory pools. Two of the pools are for indirect buffers used for packet duplication purposes. Memory pools for indirect buffers are initialized differently from the memory pool for direct buffers:

    packet_pool = rte_mempool_create("packet_pool", NB_PKT_MBUF,
        PKT_MBUF_SIZE, 32, sizeof(struct rte_pktmbuf_pool_private),
        rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
        rte_socket_id(), 0);

    header_pool = rte_mempool_create("header_pool", NB_HDR_MBUF,
        HDR_MBUF_SIZE, 32, 0, NULL, NULL, rte_pktmbuf_init, NULL,
        rte_sock
...is filled such that, for each source port, a destination port is assigned that is either the next or previous enabled port from the portmask. Naturally, the number of ports in the portmask must be even; otherwise, the application exits.

    static void
    l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
    {
        struct ether_hdr *eth;
        void *tmp;
        unsigned dst_port;

        dst_port = l2fwd_dst_ports[portid];
        eth = rte_pktmbuf_mtod(m, struct ether_hdr *);

        /* 02:00:00:00:00:xx */
        tmp = &eth->d_addr.addr_bytes[0];
        *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);

        /* src addr */
        ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);

        l2fwd_send_packet(m, (uint8_t)dst_port);
    }

Then, the packet is sent using the l2fwd_send_packet(m, dst_port) function. For this test application, the processing is exactly the same for all packets arriving on the same RX port. Therefore, it would have been possible to call the l2fwd_send_burst() function directly from the main loop, to send all the received packets on the same TX port using the burst-oriented send function, which is more efficient. However, in real-life applications (such as L3 routing), packet N is not necessarily forwarded on the same port as packet N-1. The application is implemented to illustrate that, so the same approach can be reused in a more complex application.
...is needed to store the allocated core ID for the floating process, so that the master can return it after a slave has exited accidentally.

    static int
    l2fwd_malloc_shared_struct(void)
    {
        port_statistics = rte_zmalloc("port_stat",
            sizeof(struct l2fwd_port_statistics) * RTE_MAX_ETHPORTS, 0);
        if (port_statistics == NULL)
            return -1;

        /* allocate mapping_id array */
        if (float_proc) {
            int i;
            mapping_id = rte_malloc("mapping_id",
                sizeof(unsigned) * RTE_MAX_LCORE, 0);
            if (mapping_id == NULL)
                return -1;
            for (i = 0; i < RTE_MAX_LCORE; i++)
                mapping_id[i] = INVALID_MAPPING_ID;
        }
        return 0;
    }

For each slave process, packets are received from one port and forwarded to another port that another slave is operating on. If the other slave exits accidentally, the port it is operating on may not work normally, so the first slave cannot forward packets to that port. There is a dependency on the port in this case, so the master should recognize the dependency. The following is the code to detect this dependency:

    for (portid = 0; portid < nb_ports; portid++) {
        /* skip ports that are not enabled */
        if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
            continue;

        /* Find pair ports' lcores */
        find_lcore = find_pair_lcore = 0;
        pair_port = l2fwd_dst_ports[portid];
        for (i = 0; i < RTE_MAX_LCORE; i++) {
            if (!rte_lcore_is_enabled(i))
                continue;
            for (j = 0; j < lcore_queue_conf[i].n
...is used as a microsecond sleep value through rte_delay_us(), which executes pause instructions to avoid a costly context switch while saving power at the same time.

• If the idle count is between 100 and 999, a fixed sleep interval of 100 us is used. A 100 us sleep interval allows the core to enter the C1 state while keeping a fast response time in case new traffic arrives.

• If the idle count is greater than 1000, a fixed sleep value of 1 ms is used until the next timer expiration. This allows the core to enter the C3/C6 states.

The thresholds specified above need to be adjusted for different Intel processors and traffic profiles. If a thread polls multiple Rx queues and different queues return different sleep duration values, the algorithm controls the sleep time in a conservative manner by sleeping for the least possible time, in order to avoid a potential performance impact.

12 L3 Forwarding with Access Control Sample Application

The L3 Forwarding with Access Control application is a simple example of packet processing using the Intel DPDK. The application performs a security check on received packets. Packets that are in the Access Control List (ACL), which is loaded during initialization, are dropped. Others are
...iss is to drop the packet. At run time, core A is creating the following lookup key and storing it into the packet meta-data, for core B to use for table lookup: destination IPv4 address, plus 28 bytes of 0.

    9   lpm   Longest Prefix Match (LPM) IPv4 table.

        In the case of two ports, two routes are added to the table:
            0.0.0.0/9 => send to output port 0
            0.128.0.0/9 => send to output port 1

        In the case of four ports, four entries are added to the table:
            0.0.0.0/10 => send to output port 0
            0.64.0.0/10 => send to output port 1
            0.128.0.0/10 => send to output port 2
            0.192.0.0/10 => send to output port 3

        The default table rule (used in the case of a lookup miss) is to
        drop the packet. At run time, core A is storing the IPv4
        destination within the packet meta-data, to be later used by
        core B as the lookup key.

    10  acl   Access Control List (ACL) table.

        In the case of two ports, two ACL rules are added to the table:
            priority 0 (highest), IPv4 source = ANY,
            IPv4 destination = 0.0.0.0/9, L4 protocol = ANY,
            TCP source port = ANY, TCP destination port = ANY
            => send to output port 0
            priority 0 (highest), IPv4 source = ANY, IPv4 dest
...ive queue on each network port. The secondary symmetric_mp instances must also have these parameters specified, and the first two must be the same as those passed to the primary instance, or errors result.

For example, to run a set of four symmetric_mp instances, running on lcores 1-4, all performing level 2 forwarding of packets between ports 0 and 1, the following commands can be used (assuming run as root):

    ./build/symmetric_mp -c 2 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=0
    ./build/symmetric_mp -c 4 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=1
    ./build/symmetric_mp -c 8 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=2
    ./build/symmetric_mp -c 10 -n 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=3

Note: In the above example, the process type can be explicitly specified as primary or secondary, rather than auto. When using auto, the first process run creates all the memory structures needed for all processes, irrespective of whether it has a proc-id of 0, 1, 2 or 3.

Note: For the symmetric multi-process example, since all processes work in the same manner, once the hugepage shared memory and the network ports are initialized, it is not necessary to restart all processes if the primary instance dies. Instead, that process can be restarted as a seco
...ket is dropped.
  - Forwarding: Forwards the packet to the corresponding port.

• Final Phase: Performs the following tasks: calls the EAL, PMD driver and ACL library to free resources, then quits.

12.2 Compiling the Application

To compile the application:

1. Go to the sample application directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/l3fwd-acl

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

       make

12.3 Running the Application

The application has a number of command line options:

    ./build/l3fwd-acl [EAL options] -- -p PORTMASK [-P]
        --config (port,queue,lcore)[,(port,queue,lcore)]
        --rule_ipv4=FILENAME --rule_ipv6=FILENAME [--scalar]
        [--enable-jumbo [--max-pkt-len PKTLEN]] [--no-numa]

where:

• -p PORTMASK: Hexadecimal bitmask of ports to configure.

• -P: Sets all ports to promiscuous mode, so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet addre
...lication in Real and Virtualized Environments. Where this example application differs is in the configuration of the NIC port for RX.

The VMDQ and DCB hardware feature is configured at port initialization time by setting the appropriate values in the rte_eth_conf structure passed to the rte_eth_dev_configure() API. Initially in the application, a default structure is provided for VMDQ and DCB configuration, to be filled in later by the application:

    /* empty vmdq+dcb configuration structure. Filled in programatically */
    static const struct rte_eth_conf vmdq_dcb_conf_default = {
        .rxmode = {
            .mq_mode = ETH_VMDQ_DCB,
            .split_hdr_size = 0,
            .header_split = 0,   /**< Header Split disabled */
            .hw_ip_checksum = 0, /**< IP checksum offload disabled */
            .hw_vlan_filter = 0, /**< VLAN filtering disabled */
            .jumbo_frame = 0,    /**< Jumbo Frame Support disabled */
        },
        .txmode = {
            .mq_mode = ETH_DCB_NONE,
        },
        .rx_adv_conf = {
            /*
             * should be overridden separately in code with
             * appropriate values
             */
            .vmdq_dcb_conf = {
                .nb_queue_pools = ETH_16_POOLS,
                .enable_default_pool = 0,
                .default_pool = 0,
                .nb_pool_maps = 0,
                .pool_map = {{0, 0},},
                .dcb_queue = {0},
            },
        },
    };

The get_eth_conf() function fills in an rte_eth_conf structure with the appropriate values, base
...ll determine to which guest a packet should be forwarded, or whether it needs to be sent to the external network, based on the packet destination MAC address and VLAN tag.

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --vm2vm [0,1,2]

Mergeable Buffers

The mergeable buffers parameter controls how virtio-net descriptors are used for virtio-net headers. In a disabled state, one virtio-net header is used per packet buffer; in an enabled state, one virtio-net header is used for multiple packets. The default value is 0, or disabled, since recent kernels' virtio-net drivers show performance degradation with this feature enabled.

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --mergeable [0,1]

Stats

The stats parameter controls the printing of virtio-net device statistics. The parameter specifies an interval (in seconds) at which to print statistics, with an interval of 0 seconds disabling statistics.

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --stats [0,n]

RX Retry

The rx-retry option enables/disables enqueue retries when the guest's RX queue is full. This feature resolves a packet loss that is observed at high data rates, by allowing it to delay and retry in the receive path. This option is enabled by default.

23.7

    user@t
    if (unlikely(diff_tsc > drain_tsc)) {
        /* this could be optimized (use queueid instead of
         * portid), but it is not called so often */
        for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
            if (qconf->tx_mbufs[portid].len == 0)
                continue;
            lsi_send_burst(&lcore_queue_conf[lcore_id],
                           qconf->tx_mbufs[portid].len,
                           (uint8_t) portid);
            qconf->tx_mbufs[portid].len = 0;
        }

        /* if timer is enabled */
        if (timer_period > 0) {
            /* advance the timer */
            timer_tsc += diff_tsc;

            /* if timer has reached its timeout */
            if (unlikely(timer_tsc >= (uint64_t) timer_period)) {
                /* do this only on master core */
                if (lcore_id == rte_get_master_lcore()) {
                    print_stats();
                    /* reset the timer */
                    timer_tsc = 0;
                }
            }
        }

        prev_tsc = cur_tsc;
    }

15. Load Balancer Sample Application

The Load Balancer sample application demonstrates the concept of isolating the packet I/O task from the application-specific workload. Depending on the performance target, a number of logical cores (lcores) are dedicated to handle the interaction with the NIC ports (I/O lcores), while the rest of the lcores are dedicated to…
[Figure: ring-based pipeline — the master lcore pulls packets from the ports and enqueues them on its own rings; lcore N-1 dequeues packets from lcore N-2's rings and enqueues them on the appropriate port.]

An adjustable quota value controls how many packets are being moved through the pipeline per enqueue and dequeue. Adjustable watermark values associated with the rings control a back-off mechanism that tries to prevent the pipeline from being overloaded by:

* Stopping enqueuing on rings for which the usage has crossed the high watermark threshold
* Sending Ethernet pause frames
* Only resuming enqueuing on a ring once its usage goes below a global low watermark threshold

This mechanism allows congestion notifications to go up the ring pipeline and eventually lead to an Ethernet flow control frame being sent to the source.

On top of serving as an example of quota and watermark usage, this application can be used to benchmark ring-based processing pipeline performance using a traffic generator, as shown in Figure 14.

Figure 14. Ring-based Processing Pipeline Performance Setup (Traffic Generator)

20.2 Compiling the Application

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
…compile the application as follows:

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/kni

2. Set the target (a default target is used if not specified):

   Note: This application is intended as a linuxapp only.

    export RTE_TARGET=x86_64-native-linuxapp-gcc

3. Build the application:

    make

Loading the Kernel Module

Loading the KNI kernel module without any parameter is the typical way an Intel DPDK application gets packets into and out of the kernel network stack. In this way, only one kernel thread is created for all KNI devices for packet receiving on the kernel side:

    insmod rte_kni.ko

Pinning the kernel thread to a specific core can be done using a taskset command such as the following:

    taskset -p 100000 `pgrep --fl kni_thread | awk '{print $1}'`

This command line tries to pin the specific kni_thread on the 20th lcore (lcore numbering starts at 0), which means it needs to check if that lcore is available on the board. This command must be sent after the application has been launched, as insmod does not start the kni thread.

For optimum performance, the lcore in the mask must be selected to be on the same socket as the lcores used in the KNI application.

To provide flexibility of performance, the kernel module of the…
* Number of logical cores (lcores)
* Mapping of NIC RX queues to logical cores

Each lcore communicates with every cryptographic acceleration engine in the system through a pair of dedicated input/output queues. Each lcore has a dedicated NIC TX queue with every NIC port in the system. Therefore, each lcore reads packets from its NIC RX queues and cryptographic accelerator output queues and writes packets to its NIC TX queues and cryptographic accelerator input queues.

Each incoming packet that is read from a NIC RX queue is either directly forwarded to its destination NIC TX port (forwarding path) or first sent to one of the Intel QuickAssist Technology devices for either encryption or decryption before being sent out on its destination NIC TX port (cryptographic path).

The application supports IPv4 input packets only. For each input packet, the decision between the forwarding path and the cryptographic path is taken at the classification stage, based on the value of the IP source address field read from the input packet. Assuming that the IP source address is A.B.C.D, then if:

* D = 0: the forwarding path is selected (the packet is forwarded out directly)
* D = 1: the cryptographic path for encryption is selected (the packet is first encrypted and then forwarded out)
* D = 2: the cryptographic path for decryption is selected (the packet is first decrypted and then forwarded out)
…in the shumway folder: dh89xxcc_qa_dev0.conf, dh89xxcc_qa_dev1.conf

* Coleto Creek: located in the coleto folder: dh895xcc_qa_dev0.conf

The relevant configuration file(s) must be copied to the /etc directory. Please note that any change to these configuration files requires restarting the Intel QuickAssist Technology driver using the following command:

    service qat_service restart

Refer to the following documents for information on the Intel QuickAssist Technology configuration files:

* Intel Communications Chipset 8900 to 8920 Series Software Programmer's Guide
* Intel Communications Chipset 8925 to 8955 Series Software Programmer's Guide
* Intel Communications Chipset 8900 to 8920 Series Software for Linux Getting Started Guide
* Intel Communications Chipset 8925 to 8955 Series Software for Linux Getting Started Guide

19.3.2 Traffic Generator Setup and Application Startup

The application has a number of command line options:

    dpdk_qat [EAL options] -- -p PORTMASK [--no-promisc] [--config '(port,queue,lcore)[,(port,queue,lcore)]']

where:

* -p PORTMASK: Hexadecimal bitmask of ports to configure
* --no-promisc: Disables promiscuous mode for all ports, so that only packets with the Ethernet MAC destination address set…
…on the next lcore, reloaded manually:

    lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
    rte_timer_reset(&timer1, hz/3, SINGLE, lcore_id, timer1_cb, NULL);

The callback for the first timer (timer0) only displays a message until a global counter reaches 20 (after 20 seconds). In this case, the timer is stopped using the rte_timer_stop() function:

    /* timer0 callback */
    static void
    timer0_cb(__attribute__((unused)) struct rte_timer *tim,
              __attribute__((unused)) void *arg)
    {
        static unsigned counter = 0;
        unsigned lcore_id = rte_lcore_id();

        printf("%s() on lcore %u\n", __FUNCTION__, lcore_id);

        /* this timer is automatically reloaded until we decide to
         * stop it, when counter reaches 20 */
        if ((counter++) == 20)
            rte_timer_stop(tim);
    }

The callback for the second timer (timer1) displays a message and reloads the timer on the next lcore, using the rte_timer_reset() function:

    /* timer1 callback */
    static void
    timer1_cb(__attribute__((unused)) struct rte_timer *tim,
              __attribute__((unused)) void *arg)
    {
        unsigned lcore_id = rte_lcore_id();
        uint64_t hz;

        printf("%s() on lcore %u\n", __FUNCTION__, lcore_id);

        /* reload it on another lcore */
        hz = rte_get_hpet_hz();
        lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
        rte_timer_reset(&timer1, hz/3, SINGLE, lcore_id, timer1_cb, NULL);
    }
…secondary by explicitly setting the proc-type to secondary on the command line. All subsequent instances launched will also need this explicitly specified, as auto-detection will detect no primary processes running and therefore attempt to re-initialize shared memory.

16.1.3.2 How the Application Works

The initialization calls in both the primary and secondary instances are the same for the most part, calling the rte_eal_init(), 1 G and 10 G driver initialization and then rte_eal_pci_probe() functions. Thereafter, the initialization done depends on whether the process is configured as a primary or secondary instance.

In the primary instance, a memory pool is created for the packet mbufs and the network ports to be used are initialized — the number of RX and TX queues per port being determined by the num-procs parameter passed on the command line. The structures for the initialized network ports are stored in shared memory and therefore will be accessible by the secondary process as it initializes.

    if (num_ports & 1)
        rte_exit(EXIT_FAILURE, "Application must use an even number of ports\n");

    for (i = 0; i < num_ports; i++) {
        if (proc_type == RTE_PROC_PRIMARY)
            if (smp_port_init(ports[i], mp, (uint16_t)num_procs) < 0)
                rte_exit(EXIT_FAILURE, "Error initialising ports\n");
    }

In the secondary instance, rather than initializing the network ports, the port information exported by the primary process is used…
…recommended to enable deeper C3 and C6 states in the BIOS during system boot-up.

11.3 Compiling the Application

To compile the application:

1. Go to the sample application directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/l3fwd-power

2. Set the target (a default target is used if not specified). For example:

    export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

11.4 Running the Application

The application has a number of command line options:

    ./build/l3fwd_power [EAL options] -- -p PORTMASK [-P] --config '(port,queue,lcore)[,(port,queue,lcore)]' [--enable-jumbo [--max-pkt-len PKTLEN]] [--no-numa]

where:

* -p PORTMASK: Hexadecimal bitmask of ports to configure
* -P: Sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.
* --config '(port,queue,lcore)[,(port,queue,lcore)]': Determines which queues from which ports are mapped to which cores
* --enable-jumbo: Optional; enables jumbo frames…
…running applications and the Environment Abstraction Layer (EAL) options.

The number of bits set in each bitmask must be the same. The coremask -c parameter of the EAL options should include IN_CORES and OUT_CORES. The same bit must not be set in both IN_CORES and OUT_CORES. The affinities between ports and cores are set beginning with the least significant bit of each mask; that is, the port represented by the lowest bit in PORTMASK is read from by the core represented by the lowest bit in IN_CORES, and written to by the core represented by the lowest bit in OUT_CORES.

For example, to run the application with two ports and four cores:

    ./build/exception_path -c f -n 4 -- -p 3 -i 3 -o c

Getting Statistics

While the application is running, statistics on packets sent and received can be displayed by sending the SIGUSR1 signal to the application from another terminal:

    killall -USR1 exception_path

The statistics can be reset by sending a SIGUSR2 signal in a similar way.

3.4 Explanation

The following sections provide some explanation of the code.

3.4.1 Initialization

Setup of the mbuf pool, driver and queues is similar to the setup done in the L2 Forwarding sample application (see Chapter 9, L2 Forwarding Sample Application in Real and Virtualized Environments…
    Initializing port 0 on lcore 2... Address:00:1B:21:76:FA:2C, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 1
    Initializing port 2 on lcore 4... Address:00:1B:21:5C:FF:54, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 3
    IP_FRAG: Socket 0: adding route 100.10.0.0/16 (port 0)
    IP_FRAG: Socket 0: adding route 100.20.0.0/16 (port 1)
    IP_FRAG: Socket 0: adding route 0101:0101:0101:0101:0101:0101:0101:0101/48 (port 0)
    IP_FRAG: Socket 0: adding route 0201:0101:0101:0101:0101:0101:0101:0101/48 (port 1)
    IP_FRAG: entering main loop on lcore 4
    IP_FRAG: -- lcoreid=4 portid=2
    IP_FRAG: entering main loop on lcore 2
    IP_FRAG: -- lcoreid=2 portid=0

To run the example in a linuxapp environment with 1 lcore (4) over 2 ports (0, 2) with 2 RX queues per lcore:

    ./build/ip_fragmentation -c 0x10 -n 3 -- -p 5 -q 2

To test the application, flows should be set up in the flow generator that match the values in the l3fwd_ipv4_route_array and/or l3fwd_ipv6_route_array table.

The default l3fwd_ipv4_route_array table is:

    struct l3fwd_ipv4_route l3fwd_ipv4_route_array[] = {
        {IPv4(100, 10, 0, 0), 16, 0},
        {IPv4(100, 20, 0, 0), 16, 1},
        {IPv4(100, 30, 0, 0), 16, 2},
        {IPv4(100, 40,…
Processor performance states (P-states) are the capability of an Intel processor to switch between different supported operating frequencies and voltages. If configured correctly according to system workload, this feature provides power savings.

CPUFreq is the infrastructure provided by the Linux kernel to control the processor performance state capability. CPUFreq supports a user space governor that enables setting the frequency via manipulating a virtual file device from a user space application. The Power library in the Intel DPDK provides a set of APIs for manipulating a virtual file device to allow a user space application to set the CPUFreq governor and set the frequency of specific cores.

This application includes a P-state power management algorithm to generate a frequency hint to be sent to CPUFreq. The algorithm uses the number of received and available Rx packets on recent polls to make a heuristic decision to scale frequency up or down. Specifically, some thresholds are checked to see whether a specific core running an Intel DPDK polling thread needs to increase frequency by a step, based on the near-to-full trend of polled Rx queues. Also, it decreases frequency by a step if the number of packets processed per loop is far less than the expected threshold, or if the thread's sleeping time exceeds a threshold.
…Intel DPDK. The application does L3 forwarding with IPv4 and IPv6 packet fragmentation.

Overview

The application demonstrates the use of zero copy buffers for packet fragmentation. The initialization and run-time paths are very similar to those of the L2 Forwarding sample application (see Chapter 9, L2 Forwarding Sample Application in Real and Virtualized Environments, for more information). This guide highlights the differences between the two applications.

There are three key differences from the L2 Forwarding sample application:

* The first difference is that the IP Fragmentation sample application makes use of indirect buffers.
* The second difference is that the forwarding decision is taken based on information read from the input packet's IP header.
* The third difference is that the application differentiates between IP and non-IP traffic by means of offload flags.

The Longest Prefix Match (LPM for IPv4, LPM6 for IPv6) table is used to store/lookup an outgoing port number associated with that IP address. Any unmatched packets are forwarded to the originating port.

By default, input frame sizes up to 9.5 KB are supported. Before forwarding, the input IP packet is fragmented to fit into the standard Ethernet v2 MTU (1500 bytes).

Building the Application

To build the application:

1. Go to the sample application directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/ip_fragmentation

2. Set the target (a default target…
…Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

    make

Note: For zero copy, it is first necessary to disable CONFIG_RTE_MBUF_SCATTER_GATHER, CONFIG_RTE_LIBRTE_IP_FRAG and CONFIG_RTE_LIBRTE_DISTRIBUTOR in the config file, then re-configure and compile the core libraries, and then build the application:

    vi ${RTE_SDK}/config/common_linuxapp

Change it as follows:

    CONFIG_RTE_MBUF_SCATTER_GATHER=n
    CONFIG_RTE_LIBRTE_IP_FRAG=n
    CONFIG_RTE_LIBRTE_DISTRIBUTOR=n

Then:

    cd ${RTE_SDK}
    make config T=${RTE_TARGET}
    make install T=${RTE_TARGET}
    cd ${RTE_SDK}/examples/vhost
    make

4. Go to the eventfd_link directory:

    cd ${RTE_SDK}/examples/vhost-net/eventfd_link

5. Build the eventfd_link kernel module:

    make

23.6 Running the Sample Code

1. Install the cuse kernel module:

    modprobe cuse

2. Go to the eventfd_link directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/vhost-net/eventfd_link

3. Install the eventfd_link module:

    insmod ./eventfd_link.ko

4. Go to the examples directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/vhost-net

5. Run the vhost-switch sample code:
…number of available mbufs in the table. Then, each mbuf in the table is processed by the lsi_simple_forward() function. The processing is very simple: it determines the TX port from the RX port and then replaces the source and destination MAC addresses.

Note: In the following code, the two lines for calculating the output port require some explanation. If portid is even, the first line does nothing (as portid & 1 will be 0), and the second line adds 1. If portid is odd, the first line subtracts one and the second line does nothing. Therefore, 0 goes to 1 and 1 to 0, 2 goes to 3 and 3 to 2, and so on.

    static void
    lsi_simple_forward(struct rte_mbuf *m, unsigned portid)
    {
        struct ether_hdr *eth;
        void *tmp;
        unsigned dst_port = lsi_dst_ports[portid];

        eth = rte_pktmbuf_mtod(m, struct ether_hdr *);

        /* 02:00:00:00:00:xx */
        tmp = &eth->d_addr.addr_bytes[0];
        *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40);

        /* src addr */
        ether_addr_copy(&lsi_ports_eth_addr[dst_port], &eth->s_addr);

        lsi_send_packet(m, (uint8_t) dst_port);
    }

Then, the packet is sent using the lsi_send_packet(m, dst_port) function. For this test application, the processing is exactly the same for all packets arriving on the same RX port. Therefore, it would have…
[Figure: Performance Benchmark Setup (Virtualized Environment) — a Virtual Machine running on a Host Machine via a Hypervisor. Note: Ports 0-3 are initialized from PCI Virtual Functions 0-3; the Physical Function is enabled in the Host Machine using ixgbe max_vfs=2,2.]

9.1.1 Virtual Function Setup Instructions

This application can use the virtual function available in the system, and therefore can be used in a virtual machine without passing through the whole Network Device into a guest machine in a virtualized scenario. The virtual functions can be enabled in the host machine or the hypervisor with the respective physical function driver.

For example, in a Linux host machine, it is possible to enable a virtual function using the following command:

    modprobe ixgbe max_vfs=2,2

This command enables two Virtual Functions on each of the Physical Functions of the NIC, with two physical ports in the PCI configuration space. It is important to note that enabled Virtual Functions 0 and 2 would belong to Physical Function 0, and Virtual Functions 1 and 3 would belong to Physical Function 1, in this case enabling a total of four Virtual Functions.

9.2 Compiling the Application

1. Go to the example directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/
…logical core, the main task is to read ingress packets from the RX ports and enqueue them on the port's corresponding first ring in the pipeline. This is done using the following code:

    lcore_id = rte_lcore_id();

    /* Process each port round robin style */
    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
        if (!is_bit_set(port_id, portmask))
            continue;

        ring = rings[lcore_id][port_id];

        if (ring_state[port_id] != RING_READY) {
            if (rte_ring_count(ring) > *low_watermark)
                continue;
            else
                ring_state[port_id] = RING_READY;
        }

        /* Enqueue received packets on the RX ring */
        nb_rx_pkts = rte_eth_rx_burst(port_id, 0, pkts, *quota);
        ret = rte_ring_enqueue_bulk(ring, (void *) pkts, nb_rx_pkts);
        if (ret == -EDQUOT) {
            ring_state[port_id] = RING_OVERLOADED;
            send_pause_frame(port_id, 1337);
        }
    }

For each port in the port mask, the corresponding ring's pointer is fetched into ring, and that ring's state is checked:

* If it is in the RING_READY state, *quota packets are grabbed from the port and put on the ring. Should this operation make the ring's usage cross its high watermark, the ring is marked as overloaded and an Ethernet flow control frame is sent to the source.
* If it is not in the RING_READY state, this port is ignored until the ring's usage…
…optional.

To simplify the complexity of the priority field for each rule, all ACL and route entries are assumed to be in the same file. To read data from the specified file successfully, the application assumes the following:

* Each rule occupies a single line.
* Only the following four rule line types are valid in this application:
  - ACL rule line, which starts with a leading character '@'
  - Route rule line, which starts with a leading character 'R'
  - Comment line, which starts with a leading character '#'
  - Empty line, which consists of a space, form feed ('\f'), newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), or vertical tab ('\v')

  Other line types are considered invalid.
* Rules are organized in descending order of priority, which means rules at the head of the file always have a higher priority than those further down in the file.
* A typical IPv4 ACL rule line should have a format as shown below (Figure 5):

    @192.168.0.34/32 192.168.0.36/32 0 : 65535 20 : 20 6/0xfe
     (Source Address, Destination Address, Source Ports, Dest Ports, Protocol/Mask)

IPv4 addresses are specified in CIDR format as specified in RFC 4632. They consist of the dot notation for the address and a prefix length, separated by…
    data[0] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[0], unsigned char *) +
              sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
    data[1] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[1], unsigned char *) +
              sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
    data[2] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[2], unsigned char *) +
              sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
    data[3] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[3], unsigned char *) +
              sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));

    key[0].xmm = _mm_and_si128(data[0], mask0);
    key[1].xmm = _mm_and_si128(data[1], mask0);
    key[2].xmm = _mm_and_si128(data[2], mask0);
    key[3].xmm = _mm_and_si128(data[3], mask0);

    const void *key_array[4] = {&key[0], &key[1], &key[2], &key[3]};

    rte_hash_lookup_multi(qconf->ipv4_lookup_struct, &key_array[0], 4, ret);
    dst_port[0] = (ret[0] < 0) ? portid : ipv4_l3fwd_out_if[ret[0]];
    dst_port[1] = (ret[1] < 0) ? portid : ipv4_l3fwd_out_if[ret[1]];
    dst_port[2] = (ret[2] < 0) ? portid : ipv4_l3fwd_out_if[ret[2]];
    dst_port[3] = (ret[3] < 0) ? portid : ipv4_l3fwd_out_if[ret[3]];

The simple_ipv6_fwd_4pkts() function is similar to…
…pagesize/nr_hugepages

The command above indicates how many hugepages are free to support QEMU's allocation request.

Running Intel DPDK in the Virtual Machine

For the Intel DPDK vhost-net sample code to switch packets into the VM, the sample code must first learn the MAC address of the VM's virtio-net device. The sample code detects the address from packets being transmitted from the VM, similar to a learning switch.

This behavior requires no special action or configuration with the Linux virtio-net driver in the VM, as the Linux Kernel will automatically transmit packets during device initialization. However, Intel DPDK-based applications must be modified to automatically transmit packets during initialization to facilitate the Intel DPDK vhost-net sample code's MAC learning.

The Intel DPDK testpmd application can be configured to automatically transmit packets during initialization and to act as an L2 forwarding switch.

23.8.1 Testpmd MAC Forwarding

At high packet rates, a minor packet loss may be observed. To resolve this issue, a wait-and-retry mode is implemented in the testpmd and vhost sample code. In the wait-and-retry mode, if the virtqueue is found to be full, then testpmd waits for a period of time before retrying to enqueue packets. The wait-and-retry…
…at compile time.

When the selected lookup method is hash-based, a hash object is used to emulate the flow classification stage. The hash object is used in correlation with a flow table to map each input packet to its flow at runtime.

The hash lookup key is represented by a DiffServ 5-tuple composed of the following fields read from the input packet: Source IP Address, Destination IP Address, Protocol, Source Port and Destination Port. The ID of the output interface for the input packet is read from the identified flow table entry. The set of flows used by the application is statically configured and loaded into the hash at initialization time.

When the selected lookup method is LPM-based, an LPM object is used to emulate the forwarding stage for IPv4 packets. The LPM object is used as the routing table to identify the next hop for each input packet at runtime.

The LPM lookup key is represented by the Destination IP Address field read from the input packet. The ID of the output interface for the input packet is the next hop returned by the LPM lookup. The set of LPM rules used by the application is statically configured and loaded into the LPM object at initialization time.

In the sample application, hash-based forwarding supports IPv4 and IPv6; LPM-based forwarding supports IPv4 only.

10.2 Compiling the Application

To compile the application:

1. Go to the sample application directory:

    export RTE_SDK=/path/to/rte_sdk
    cd ${RTE_SDK}/examples/
…port %d\n", port_id);
        return ret;
    }

    /* Restart specific port */
    ret = rte_eth_dev_start(port_id);
    if (ret < 0) {
        RTE_LOG(ERR, APP, "Fail to restart port %d\n", port_id);
        return ret;
    }

    return 0;
    }

    /* Callback for request of configuring network interface up/down */
    static int
    kni_config_network_interface(uint8_t port_id, uint8_t if_up)
    {
        int ret = 0;

        if (port_id >= rte_eth_dev_count() || port_id >= RTE_MAX_ETHPORTS) {
            RTE_LOG(ERR, APP, "Invalid port id %d\n", port_id);
            return -EINVAL;
        }

        RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
                port_id, if_up ? "up" : "down");

        if (if_up != 0) {
            /* Configure network interface up */
            rte_eth_dev_stop(port_id);
            ret = rte_eth_dev_start(port_id);
        } else {
            /* Configure network interface down */
            rte_eth_dev_stop(port_id);
        }

        if (ret < 0)
            RTE_LOG(ERR, APP, "Failed to start port %d\n", port_id);

        return ret;
    }

9. L2 Forwarding Sample Application in Real and Virtualized Environments

The L2 Forwarding sample application is a simple example of packet processing using the Intel Data Plane Development Kit (Intel DPDK), which also takes advantage of Single Root I/O Virtualization…
…application, in place of the kernel's vhost-net module.

The Intel DPDK vhost-net sample code is a simple packet switching application with the following features:

* Management of virtio-net device creation/destruction events
* Mapping of the VM's physical memory into the Intel DPDK vhost-net sample code's address space
* Triggering/receiving notifications to/from VMs via eventfds
* A virtio-net back-end implementation providing a subset of virtio-net features
* Packet switching between virtio-net devices and the network interface card, including using VMDQs to reduce the switching that needs to be performed in software

The following figure shows the architecture of the Vhost sample application.

Figure 19. Vhost-net Architectural Overview
[Figure: the vhost-net sample code runs in user space, exposing /dev/vhost-net, alongside QEMU and the kernel.]

The following figure shows the flow of packets through the vhost-net sample application.

Figure 20. Packet Flow Through the vhost-net Sample Application
[Figure: packets flow between Virtual Machine 0 and the NIC through the vhost-net sample code's switch core and user threads.]

23.3 Supported Distributions

The example in this section has been…
…protocol 6, destined to port 1.

* Rule 3: The third line tells the application to forward all packets to port 0. This is something like a default route entry.

As described earlier, the application assumes rules are listed in descending order of priority; therefore, Rule 1 has the highest priority, then Rule 2, and finally Rule 3 has the lowest priority.

Consider the arrival of the following three packets:

* Packet 1 has source IP address 1.2.3.4, destination IP address 192.168.0.36, and protocol 6.
* Packet 2 has source IP address 1.2.4.4, destination IP address 192.168.0.36, and protocol 6.
* Packet 3 has source IP address 1.2.3.4, destination IP address 192.168.0.36, and protocol 8.

Observe that:

* Packet 1 matches all of the rules.
* Packet 2 matches Rule 2 and Rule 3.
* Packet 3 only matches Rule 3.

For priority reasons, Packet 1 matches Rule 1 and is dropped. Packet 2 matches Rule 2 and is forwarded to port 1. Packet 3 matches Rule 3 and is forwarded to port 0.

For more details on the rule file format, please refer to the rule_ipv4.db and rule_ipv6.db files inside ${RTE_SDK}/examples/l3fwd-acl.

12.1.5 Application Phases

Once the application starts, it transitions through three phases:

* Initialization Phase…
The steps to do this are described in the Managing TAP Interfaces and Bridges section that follows.

3.4.3 Managing TAP Interfaces and Bridges

The Exception Path sample application creates TAP interfaces with names of the format tap_dpdk_nn, where nn is the lcore ID. These TAP interfaces need to be configured for use:

    ifconfig tap_dpdk_00 up

To set up a bridge between two interfaces, so that packets sent to one interface can be read from another, use the brctl tool:

    brctl addbr "br0"
    brctl addif br0 tap_dpdk_00
    brctl addif br0 tap_dpdk_03
    ifconfig br0 up

The TAP interfaces created by this application exist only when the application is running, so the steps above need to be repeated each time the application is run. To avoid this, persistent TAP interfaces can be created using openvpn:

    openvpn --mktun --dev tap_dpdk_00

If this method is used, then the steps above have to be done only once and the same TAP interfaces can be reused each time the application is run. To remove bridges and persistent TAP interfaces, the following commands are used:

    ifconfig br0 down
    brctl delbr br0
    openvpn --rmtun --dev tap_dpdk_00
6 IPv4 Multicast Sample Application

The IPv4 Multicast application is a simple example of packet processing using the Intel Data Plane Development Kit (Intel DPDK). The application performs L3 multicasting.

6.1 Overview

The application demonstrates the use of zero-copy buffers for packet forwarding. The initialization and run-time paths are very similar to those of the L2 Forwarding application (see Chapter 9, L2 Forwarding Sample Application (in Real and Virtualized Environments), for more information). This guide highlights the differences between the two applications.

There are two key differences from the L2 Forwarding sample application:
• The IPv4 Multicast sample application makes use of indirect buffers.
• The forwarding decision is taken based on information read from the input packet's IPv4 header.

The lookup method is the Four-byte Key (FBK) hash-based method. The lookup table is composed of pairs of destination IPv4 address (the FBK) and a port mask associated with that IPv4 address.

For convenience and simplicity, this sample application does not take IANA-assigned multicast addresses into account, but instead equates the last four bytes of the multicast group (that is, the last four bytes of the destination IP address) with the mask of ports to multicast packets to. Also, the application does not consider the Ethernet addresses; it looks only at…
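The lookup described above (a four-byte key mapped to a port mask) can be sketched as follows. The toy `lookup_port_mask()` and its linear table are illustrative stand-ins for the rte_fbk_hash API, not its implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative table entry: a four-byte multicast group address mapped to a
 * bitmask of the ports the packet should be cloned to. */
struct mcast_entry {
    uint32_t group;     /* last four bytes of the destination IP */
    uint16_t port_mask; /* bit i set => clone packet to port i */
};

static int
lookup_port_mask(const struct mcast_entry *tbl, size_t n, uint32_t group)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tbl[i].group == group)
            return tbl[i].port_mask;
    return -1; /* no entry: the packet is dropped */
}
```

For example, an entry mapping group 224.0.1.1 (0xE0000101) to port mask 0x5 would clone a matching packet to ports 0 and 2.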
…worker lcores for packet forwarding. This is a mandatory parameter.

--rsz "A, B, C, D": Ring sizes:
a. A: The size (in number of buffer descriptors) of each of the NIC RX rings read by the I/O RX lcores.
b. B: The size (in number of elements) of each of the software rings used by the I/O RX lcores to send packets to worker lcores.
c. C: The size (in number of elements) of each of the software rings used by the worker lcores to send packets to I/O TX lcores.
d. D: The size (in number of buffer descriptors) of each of the NIC TX rings written by I/O TX lcores.

--bsz "(A, B), (C, D), (E, F)": Burst sizes:
a. A: The I/O RX lcore read burst size from NIC RX.
b. B: The I/O RX lcore write burst size to the output software rings.
c. C: The worker lcore read burst size from the input software rings.
d. D: The worker lcore write burst size to the output software rings.
e. E: The I/O TX lcore read burst size from the input software rings.
f. F: The I/O TX lcore write burst size to the NIC TX.

--pos-lb POS: The position of the 1-byte field within the input packet used by the I/O RX lcores to identify the worker lcore for the current packet. This field needs to be within the first 64 bytes of the input packet.

The infrastructure of software…
…for 2 ports (0 and 2) with 2 RX queues per lcore:

./build/ip_reassembly -c 0x10 -n 3 -- -p 5 -q 2

To test the application, flows should be set up in the flow generator that match the values in the l3fwd_ipv4_route_array and/or l3fwd_ipv6_route_array table.

Please note that in order to test this application, the traffic generator should be generating valid fragmented IP packets. For IPv6, the only supported case is when no extension headers other than the fragment extension header are present in the packet.

The default l3fwd_ipv4_route_array table is:

struct l3fwd_ipv4_route l3fwd_ipv4_route_array[] = {
    {IPv4(100, 10, 0, 0), 16, 0},
    {IPv4(100, 20, 0, 0), 16, 1},
    {IPv4(100, 30, 0, 0), 16, 2},
    {IPv4(100, 40, 0, 0), 16, 3},
    {IPv4(100, 50, 0, 0), 16, 4},
    {IPv4(100, 60, 0, 0), 16, 5},
    {IPv4(100, 70, 0, 0), 16, 6},
    {IPv4(100, 80, 0, 0), 16, 7},
};

The default l3fwd_ipv6_route_array table defines the corresponding set of IPv6 routes, one per output port (0 through 7).
…for general information on running applications and the Environment Abstraction Layer (EAL) options.

Note that, unlike a traditional bridge or the l2fwd sample application, no MAC address changes are done on the frames. Do not forget to take that into account when configuring your traffic generators if you decide to test this sample application.

25 Internet Protocol (IP) Pipeline Sample Application

The Internet Protocol (IP) Pipeline application illustrates the use of the Intel DPDK Packet Framework tool suite. The Intel DPDK pipeline methodology is used to implement functional blocks such as packet RX, packet TX, flow classification, firewall, routing, IP fragmentation, IP reassembly, etc., which are then assigned to different CPU cores and connected together to create complex multi-core applications.

25.1 Overview

The pipelines for packet RX, packet TX, flow classification, firewall, routing, IP fragmentation, IP reassembly, management, etc. are instantiated on different CPU cores and connected together through software queues. One of the CPU cores can be designated as the management core to run a Command Line Interface (CLI) to add entries to each table (e.g. flow table, firewall rule database, routing table, Address Resolution Protocol (ARP)…
Figure 10. Slave Process Recovery Process Flow

16.1.5.3 Floating Process Support

When the Intel DPDK application runs, there is always a -c option passed in to indicate the cores that are enabled. Then, the Intel DPDK creates a thread for each enabled core. By doing so, it creates a 1:1 mapping between the enabled core and each thread. The enabled core always has an ID; therefore, each thread has a unique core ID in the Intel DPDK execution environment. With the ID, each thread can easily access the structures or resources exclusively belonging to it without using function parameter passing. It can easily use the rte_lcore_id() function to get the value in every function that is called.

For threads/processes not created in that way, either pinned to a core or not, they will not own a unique ID and the rte_lcore_id() function will not work in the correct way. However, sometimes these threads/processes still need the unique ID mechanism for easy access to structures or resources. For example, the Intel DPDK mempool library provides a local cache mechanism (refer to the Intel DPDK Programmer's Guide, Section 6.4, Local Cache) for fast element allocation and freeing. If using a non-unique ID or a fake one, a race condition occurs if two or more…
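The unique-ID requirement for floating threads can be sketched with a thread-local cache fed from an atomic counter, mimicking what rte_lcore_id() gives pinned EAL threads. The name `get_unique_id()` and the allocation scheme are assumptions for illustration; they are not the sample's actual mechanism.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

static atomic_uint next_id;                     /* shared ID allocator */
static _Thread_local unsigned my_id = UINT_MAX; /* per-thread cached ID */

/* The first call from a thread claims a fresh ID atomically; later calls
 * return the cached value, so the ID is stable for the thread's lifetime. */
static unsigned
get_unique_id(void)
{
    if (my_id == UINT_MAX)
        my_id = atomic_fetch_add(&next_id, 1);
    return my_id;
}
```

Because the counter is atomic, two floating threads can never claim the same ID, which is exactly the property the mempool local cache relies on.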
…where main() is called, in the case of a linuxapp environment. The argc and argv arguments are provided to the rte_eal_init() function. The value returned is the number of parsed arguments.

4.3.2 Starting Application Unit Lcores

Once the EAL is initialized, the application is ready to launch a function on an lcore. In this example, lcore_hello() is called on every available lcore. The following is the definition of the function:

static int
lcore_hello(__attribute__((unused)) void *arg)
{
    unsigned lcore_id;

    lcore_id = rte_lcore_id();
    printf("hello from core %u\n", lcore_id);
    return 0;
}

The code that launches the function on each lcore is as follows:

/* call lcore_hello() on every slave lcore */
RTE_LCORE_FOREACH_SLAVE(lcore_id) {
    rte_eal_remote_launch(lcore_hello, NULL, lcore_id);
}

/* call it on master lcore too */
lcore_hello(NULL);

The following code is equivalent and simpler:

rte_eal_mp_remote_launch(lcore_hello, NULL, CALL_MASTER);

Refer to the Intel DPDK API Reference for detailed information on the rte_eal_mp_remote_launch() function.

5 IP Fragmentation Sample Application

The IPv4 Fragmentation application is a simple example of packet processing using the Intel Data Plane Development Kit (Intel DPDK)…
…are varied, since the dependency cases are complex. It depends on what the exited process had shared. However, it is necessary to notify the peer(s) if one slave exited. Then, the peer(s) will be aware of that and wait until the new instance begins to run.

Therefore, to provide the capability to resume the new slave instance if the previous one exited, it is necessary to provide several mechanisms:

1. Keep a resource list for each slave process. Before a slave process runs, the master should prepare a resource list. After it exits, the master could either delete the allocated resources and create new ones, or re-initialize those for use by the new instance.

2. Set up a notification mechanism for slave process exit cases. After the specific slave leaves, the master should be notified and then help to create a new instance. This mechanism is provided in Section 16.1.5.1, Master-slave Process Models, on page 101.

3. Use a synchronization mechanism among dependent processes. The master should have the capability to stop or kill slave processes that have a dependency on the one that has exited. Then, after the new instance of the exited slave process begins to run, the dependent ones could resume or run from the start. The example sends a STOP command to slave processes dependent on the exited one, and then they will exit. Thereafter, the master creates new instances for the exited slave processes.

The following diagram describes slave process recovery (Figure 10).
…thread. If configured with one or more lcore IDs, one or more KNI devices will be allocated for each port, while specific lcore affinity will be set for its kernel thread.

In single kernel thread mode, if configured with none, a KNI device will be allocated for each port. If configured with one or more lcore IDs, one or more KNI devices will be allocated for each port, while no lcore affinity will be set, as there is only one kernel thread for all KNI devices.

For example, to run the application with two ports served by six lcores, one lcore of RX, one lcore of TX, and one lcore of kernel thread for each port:

./build/kni -c 0xf0 -n 4 -- -P -p 0x3 --config="(0,4,6,8),(1,5,7,9)"

KNI Operations

Once the KNI application is started, one can use different Linux commands to manage the net interfaces. If more than one KNI device is configured for a physical port, only the first KNI device will be paired to the physical device; operations on the other KNI devices will not affect the physical port handled in the user space application.

Assigning an IP address:

ifconfig vEth0_0 192.168.0.1

Displaying the NIC registers:

ethtool -d vEth0_0

Dumping the network traffic:

tcpdump -i vEth0_0

When the Intel DPDK userspace application is closed, all the KNI devices are deleted from Linux.

Explanation

The following sections provide some explanation of the code.
…from the L2 Forwarding sample application:
• The first difference is that the forwarding decision is taken based on information read from the input packet's IP header.
• The second difference is that the application differentiates between IP and non-IP traffic by means of offload flags.

The Longest Prefix Match (LPM for IPv4, LPM6 for IPv6) table is used to store/lookup an outgoing port number associated with that IPv4 address. Any unmatched packets are forwarded to the originating port.

7.2 Compiling the Application

To compile the application:

1. Go to the sample application directory:

export RTE_SDK=/path/to/rte_sdk
cd ${RTE_SDK}/examples/ip_reassembly

2. Set the target (a default target is used if not specified). For example:

export RTE_TARGET=x86_64-native-linuxapp-gcc

See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

make

7.3 Running the Application

The application has a number of command line options:

./build/ip_reassembly [EAL options] -- -p PORTMASK [-q NQ] [--maxflows=FLOWS] [--flowttl=TTL[(s|ms)]]

where:
• -p PORTMASK: Hexadecimal bitmask of ports to configure
• -q NQ: Number of RX queues per lcore
• --maxflows=FLOWS: determines the maximum number of active fragmented flows (1-65535). Default value…
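The longest-prefix-match lookup mentioned above can be sketched over a small route array. The real application uses librte_lpm, which is far more efficient than this linear scan; the `route` struct merely mirrors the network/depth/output-port style of the l3fwd route arrays, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t net;   /* network address, host byte order */
    uint8_t  depth; /* prefix length in bits */
    uint8_t  port;  /* output port */
};

/* Return the output port of the longest matching prefix, or -1 if the
 * destination matches no route. */
static int
lpm_lookup(const struct route *r, size_t n, uint32_t ip)
{
    int best = -1, best_depth = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        /* depth 0 means "match everything" (mask of all zeroes) */
        uint32_t mask = r[i].depth ? ~0u << (32 - r[i].depth) : 0;

        if ((ip & mask) == r[i].net && (int)r[i].depth > best_depth) {
            best = r[i].port;
            best_depth = r[i].depth;
        }
    }
    return best;
}
```

For instance, with routes 100.10.0.0/16 to port 0 and 100.20.0.0/16 to port 1, a packet destined to 100.10.1.1 resolves to port 0.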
4.1 Compiling the Application ...................................... 21
4.2 Running the Application ........................................ 21
4.3 Explanation .................................................... 21
    4.3.1 EAL Initialization ....................................... 21
    4.3.2 Starting Application Unit Lcores ......................... 22
5 IPv4 Fragmentation Sample Application ............................ 23
    5.1 Overview ................................................... 23
    5.2 Building the Application ................................... 23
    5.3 Running the Application .................................... 24
6 IPv4 Multicast Sample Application ................................ 26
    6.1 Overview ................................................... 26
    6.2 Building the Application ................................... 26
    6.3 Running the Application .................................... 27
    6.4 Explanation ................................................ 27
        6.4.1 Memory Pool Initialization ........................... 28
        6.4.2 Hash Initialization .................................. 28
        6.4.3 Forwarding ........................................... 29
        6.4.4 Buffer Cloning ....................................... 30
7 IP Reassembly Sample Application ................................. 33
    7.1 Overview ................................................... 33
    7.2 Compiling the Application .................................. 33
    7.3 Running the Application ....................................
…provided in the rte_mbuf library. If a more complex application wants to extend the rte_pktmbuf structure for its own needs, a new function derived from rte_pktmbuf_init() can be created.

Ports Configuration and Pairing

Each port in the port mask is configured, and a corresponding ring is created in the master lcore's array of rings. This ring is the first in the pipeline and will hold the packets directly coming from the port:

for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
    if (is_bit_set(port_id, portmask)) {
        configure_eth_port(port_id);
        init_ring(master_lcore_id, port_id);
    }
}

pair_ports();

The configure_eth_port() and init_ring() functions are used to configure a port and a ring respectively, and are defined in init.c. They make use of the Intel DPDK APIs defined in rte_ethdev.h and rte_ring.h. pair_ports() builds the port_pairs[] array so that its key-value pairs are a mapping between reception and transmission ports. It is defined in init.c.

Logical Cores Assignment

The application uses the master logical core to poll all the ports for new packets and enqueue them on a ring associated with the port. Each logical core except the last runs pipeline_stage() after a ring for each used port is initialized…
rte_mempool_lookup(). The network ports for use by the processes are obtained by loading the network port drivers and probing the PCI bus, which will, as in the symmetric multi-process example, automatically get access to the network ports using the settings already configured by the primary/server process.

Once all applications are initialized, the server operates by reading packets from each network port in turn and distributing those packets to the client queues (software rings, one for each client process) in round-robin order. On the client side, the packets are read from the rings in as large bursts as possible, then routed out to a different network port. The routing used is very simple: all packets received on the first NIC port are transmitted back out on the second port, and vice versa. Similarly, packets are routed between the 3rd and 4th network ports, and so on. The sending of packets is done by writing the packets directly to the network ports; they are not transferred back via the server process.

In both the server and the client processes, outgoing packets are buffered before being sent, so as to allow the sending of multiple packets in a single burst to improve efficiency. For example, the client process will buffer packets to send until either the buffer is full or until we receive no further packets from the server.
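The server's round-robin distribution policy can be sketched as a rotating index over the client rings. The names below (`NUM_CLIENTS`, `next_client_queue`) are illustrative assumptions, not the sample's actual code.

```c
#include <assert.h>

#define NUM_CLIENTS 3

static unsigned next_client; /* index of the ring receiving the next packet */

/* Pick the client queue for the next received packet, in round-robin order:
 * 0, 1, 2, 0, 1, 2, ... */
static unsigned
next_client_queue(void)
{
    unsigned c = next_client;

    next_client = (next_client + 1) % NUM_CLIENTS;
    return c;
}
```

Each received packet would be enqueued on the software ring whose index this function returns, so over time every client sees an equal share of the traffic.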
rte_snprintf(s, sizeof(s), "IPV4_L3FWD_LPM_%d", socketid);

ipv4_l3fwd_lookup_struct[socketid] =
    rte_lpm_create(s, socketid, IPV4_L3FWD_LPM_MAX_RULES, 0);
if (ipv4_l3fwd_lookup_struct[socketid] == NULL)
    rte_exit(EXIT_FAILURE,
             "Unable to create the l3fwd LPM table on socket %d\n",
             socketid);

/* populate the LPM table */
for (i = 0; i < IPV4_L3FWD_NUM_ROUTES; i++) {
    /* skip unused ports */
    if ((1 << ipv4_l3fwd_route_array[i].if_out & enabled_port_mask) == 0)
        continue;

    ret = rte_lpm_add(ipv4_l3fwd_lookup_struct[socketid],
                      ipv4_l3fwd_route_array[i].ip,
                      ipv4_l3fwd_route_array[i].depth,
                      ipv4_l3fwd_route_array[i].if_out);

    if (ret < 0) {
        rte_exit(EXIT_FAILURE,
                 "Unable to add entry %u to the l3fwd LPM table on socket %d\n",
                 i, socketid);
    }

    printf("LPM: Adding route 0x%08x / %d (%d)\n",
           (unsigned)ipv4_l3fwd_route_array[i].ip,
           ipv4_l3fwd_route_array[i].depth,
           ipv4_l3fwd_route_array[i].if_out);
}
#endif

10.4.3 Packet Forwarding for Hash-based Lookups

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward() or simple_ipv4_fwd_4pkts() function for IPv4 packets, or the simple_ipv6_fwd_4pkts() function for IPv6 packets. The l3fwd_simple_forward() function provides the basic functionality…
…forwarding application:

testpmd> set fwd mac_retry

The testpmd application can then be configured to start processing packets, transmitting packets first so that the Intel DPDK vhost sample code on the host can learn the MAC address:

testpmd> start tx_first

Note: Please note that "set fwd mac_retry" is used in place of "set fwd mac_fwd" to ensure the retry feature is activated.

23.9 Passing Traffic to the Virtual Machine Device

For a virtio-net device to receive traffic, the traffic's Layer 2 header must include both the virtio-net device's MAC address and VLAN tag. The Intel DPDK sample code behaves in a similar manner to a learning switch in that it learns the MAC address of the virtio-net devices from the first transmitted packet. On learning the MAC address, the Intel DPDK vhost sample code prints a message with the MAC address and VLAN tag of the virtio-net device. For example:

DATA: (0) MAC_ADDRESS cc:bb:bb:bb:bb:bb and VLAN_TAG 1000 registered

The above message indicates that device 0 has been registered with MAC address cc:bb:bb:bb:bb:bb and VLAN tag 1000. Any packets received on the NIC with these values are placed on the device's receive queue. When a virtio-net device transmits packets, the VLAN tag is added to the packet by the Intel DPDK vhost sample code.

24 Netmap Compatibility Sample Application
for (j = 0; j < lcore_queue_conf[i].n_rx_port; j++) {
    if (lcore_queue_conf[i].rx_port_list[j] == portid) {
        lcore = i;
        find_lcore = 1;
        break;
    }
    if (lcore_queue_conf[i].rx_port_list[j] == pair_port) {
        pair_lcore = i;
        find_pair_lcore = 1;
        break;
    }
}

if (find_lcore && find_pair_lcore)
    break;

if (!find_lcore || !find_pair_lcore)
    rte_exit(EXIT_FAILURE, "Not find port=%d pair\n", portid);

printf("lcore %u and %u paired\n", lcore, pair_lcore);
lcore_resource[lcore].pair_id = pair_lcore;
lcore_resource[pair_lcore].pair_id = lcore;

Before launching the slave process, it is necessary to set up the communication channel between the master and slave, so that the master can notify the slave if its peer process with the dependency exited. In addition, the master needs to register a callback function for the case where a specific slave exited:

for (i = 0; i < RTE_MAX_LCORE; i++) {
    if (lcore_resource[i].enabled) {
        /* Create ring for master and slave communication */
        ret = create_ms_ring(i);
        if (ret != 0)
            rte_exit(EXIT_FAILURE, "Create ring for lcore=%u failed\n", i);

        if (flib_register_slave_exit_notify(i, slave_exit_cb) != 0)
            rte_exit(EXIT_FAILURE, "Register master trace slave exit failed\n");
    }
}

After launching the slave process, the master waits and prints out…
…directory:

export RTE_SDK=/path/to/rte_sdk
cd ${RTE_SDK}/app/test-pipeline

2. Set the target (a default target is used if not specified):

export RTE_TARGET=x86_64-native-linuxapp-gcc

3. Build the application:

make

26.3 Running the Application

26.3.1 Application Command Line

The application execution command line is:

./test-pipeline [EAL options] -- -p PORTMASK --TABLE_TYPE

The -c EAL CPU core mask option has to contain exactly 3 CPU cores. The first CPU core in the core mask is assigned for core A, the second for core B, and the third for core C. The PORTMASK parameter must contain 2 or 4 ports.

26.3.2 Table Types and Behavior

Table 3 describes the table types used and how they are populated. The hash tables are pre-populated with 16 million keys. For hash tables, the following parameters can be selected:

• Configurable key size implementation or fixed (specialized) key size implementation (e.g. hash-8-ext or hash-spec-8-ext). The key size specialized implementations are expected to provide better performance for 8-byte and 16-byte key sizes, while the key size non-specialized implementation is expected to provide better performance for larger key sizes.
• Key size (e.g. hash-spec-8-ext or hash-spec-16-ext). The available options are 8, 16 and 32 bytes…
…retry algorithm is implemented in Intel DPDK testpmd as a forwarding method called mac_retry. The following sequence diagram describes the algorithm in detail.

Figure 21. Packet Flow on TX in DPDK-testpmd

(Figure: get a burst of packets from the interface; modify the source and destination MAC addresses of each packet; check whether there are enough free descriptors in the vhost available ring, and if not, wait and try again a defined number of times; finally, add descriptors with the new buffer addresses to the available ring.)

23.8.2 Running Testpmd

The testpmd application is automatically built when Intel DPDK is installed. Run the testpmd application as follows:

user@target:~$ x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i

The destination MAC address for packets transmitted on each port can be set at the command line:

user@target:~$ x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i --eth-peer=0,aa:bb:cc:dd:ee:ff --eth-peer=1,ff:ee:dd:cc:bb:aa

• Packets received on port 1 will be forwarded on port 0 to MAC address aa:bb:cc:dd:ee:ff.
• Packets received on port 0 will be forwarded on port 1 to MAC address ff:ee:dd:cc:bb:aa.
…s. If not specified, the example will use a pinned process to perform the L2 forwarding task.

To verify the recovery mechanism, proceed as follows. First, check the PID of the slave processes:

#ps -fe | grep l2fwd_fork
root 5136 4843 29 11:11 pts/1 00:00:05 ./build/l2fwd_fork
root 5145 5136 98 11:11 pts/1 00:00:11 ./build/l2fwd_fork
root 5146 5136 98 11:11 pts/1 00:00:11 ./build/l2fwd_fork

Then kill one of the slaves:

#kill -9 5145

After 1 or 2 seconds, check whether the slave has resumed:

#ps -fe | grep l2fwd_fork
root 5136 4843 3 11:11 pts/1 00:00:06 ./build/l2fwd_fork
root 5247 5136 99 11:14 pts/1 00:00:01 ./build/l2fwd_fork
root 5248 5136 99 11:14 pts/1 00:00:01 ./build/l2fwd_fork

One can also monitor the traffic generator statistics to see whether the slave processes have resumed.

16.1.5.5 Explanation

As described in previous sections, not all global and static variables need to change to be accessible in multiple processes; it depends on how they are used. In this example, the statistics info (counts of packets dropped/forwarded/received) needs to be updated by the slave process, and the master needs to see the update and print it out. So, it needs to allocate a heap buffer using rte_zmalloc. In addition, if the -f option is specified, an array…
…usage crosses the low watermark value.

The pipeline stage function's task is to process and move packets from the preceding pipeline stage. This thread is running on most of the logical cores to create an arbitrarily long pipeline:

lcore_id = rte_lcore_id();
previous_lcore_id = get_previous_lcore_id(lcore_id);

for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
    if (!is_bit_set(port_id, portmask))
        continue;

    tx = rings[lcore_id][port_id];
    rx = rings[previous_lcore_id][port_id];

    if (ring_state[port_id] != RING_READY) {
        if (rte_ring_count(tx) > *low_watermark)
            continue;
        else
            ring_state[port_id] = RING_READY;
    }

    /* Dequeue up to quota mbuf from rx */
    nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts, *quota);
    if (unlikely(nb_dq_pkts == 0))
        continue;

    /* Enqueue them on tx */
    ret = rte_ring_enqueue_bulk(tx, pkts, nb_dq_pkts);
    if (ret == -EDQUOT)
        ring_state[port_id] = RING_OVERLOADED;
}

The thread's logic works mostly like receive_stage(), except that packets are moved from ring to ring instead of port to ring. In this example, no actual processing is done on the packets, but pipeline_stage() is an ideal place to perform any processing required by the application.
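The READY/OVERLOADED logic above is a hysteresis between the low and high watermarks: once a ring overflows its quota, the stage stops feeding it until it drains below the low watermark. A minimal sketch of that state machine (names and thresholds are illustrative, not the sample's code):

```c
#include <assert.h>

enum ring_state { RING_READY, RING_OVERLOADED };

/* Returns 1 if the stage may move packets into the ring. Once overloaded,
 * the ring stays blocked until its fill level drops to the low watermark,
 * which prevents rapid flapping around a single threshold. */
static int
may_enqueue(enum ring_state *st, unsigned count,
            unsigned low_wm, unsigned high_wm)
{
    if (*st == RING_OVERLOADED) {
        if (count > low_wm)
            return 0;           /* still draining: keep blocked */
        *st = RING_READY;
    }
    if (count > high_wm)        /* crossed the high watermark */
        *st = RING_OVERLOADED;
    return 1;
}
```

The two thresholds give the downstream stage time to catch up before the upstream stage resumes enqueueing.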
…use of the Intel DPDK to create a path for packets to go through the Linux kernel. This is done by creating one or more kernel net devices for each of the Intel DPDK ports. The application allows the use of standard Linux tools (ethtool, ifconfig, tcpdump) with the Intel DPDK ports, and also the exchange of packets between the Intel DPDK application and the Linux kernel.

Overview

The Kernel NIC Interface sample application uses two threads in user space for each physical NIC port being used, and allocates one or more KNI devices for each physical NIC port with kernel module support. For a physical NIC port, one thread reads from the port and writes to KNI devices, and another thread reads from KNI devices and writes the data unmodified to the physical NIC port. It is recommended to configure one KNI device for each physical NIC port. If configured with more than one KNI device for a physical NIC port, it is just for performance testing, or it can work together with VMDq support in the future.

The packet flow through the Kernel NIC Interface application is as shown in the following figure.

Figure 2. Kernel NIC Application Packet Flow

(Figure: traffic generator, NIC ports, the KNI sample application in user space, and the Linux kernel exchanging packets through the KNI devices.)

Compiling the Application
…use of the cryptographic operations provided by the Intel QuickAssist Technology from within the Intel DPDK environment. Therefore, building and running this application requires having both the Intel DPDK and the QuickAssist Technology Software Library installed, as well as at least one Intel QuickAssist Technology hardware device present in the system.

For this sample application, there is a dependency on either of:
• Intel Communications Chipset 8900 to 8920 Series Software for Linux package
• Intel Communications Chipset 8925 to 8955 Series Software for Linux package

19.1 Overview

An overview of the application is provided in Figure 12. For simplicity, only two NIC ports and one Intel QuickAssist Technology device are shown in this diagram, although the number of NIC ports and Intel QuickAssist Technology devices can be different.

Figure 12. Intel QuickAssist Technology Application Block Diagram

(Figure: NIC RX queues feed software threads 0 through N-1, each of which offloads to a device with Intel QuickAssist accelerators and then transmits on the NIC TX queues.)

Note: Lines in blue show the packet flow for Software Thread 0 and lines in red show the packet flow for Software Thread N-1.

The application allows the configuration of the following items:
• Number of NIC ports
…address of the port are accepted.
• --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues from which ports are mapped to which cores
• --rule_ipv4 FILENAME: Specifies the IPv4 ACL and route rules file
• --rule_ipv6 FILENAME: Specifies the IPv6 ACL and route rules file
• --scalar: Use a scalar function to perform rule lookup
• --enable-jumbo: optional, enables jumbo frames
• --max-pkt-len: optional, maximum packet length in decimal (64-9600)
• --no-numa: optional, disables NUMA awareness

As an example, consider a dual processor socket platform where cores 0, 2, 4, 6, 8 and 10 appear on socket 0, while cores 1, 3, 5, 7, 9 and 11 appear on socket 1. Let's say that the user wants to use memory from both NUMA nodes, the platform has only two ports, and the user wants to use two cores from each processor socket to do the packet processing.

To enable L3 forwarding between two ports, using two cores from each processor, while also taking advantage of local memory access by optimizing around NUMA, the user must enable two queues from each port, pin to the appropriate cores and allocate memory from the appropriate NUMA node. This is achieved using the following command:

./build/l3fwd-acl -c f -n 4 -- -p 0x3 --config="(0,0,0),(0,1,…
…t. The packet flow through the exception path application is as shown in the following figure.

Figure 1. Packet Flow

(Figure: a traffic generator feeds the NIC ports; application cores in user space move packets between each port and its tap_dpdk_nn interface in kernel space, where kernel bridges forward them.)

To make throughput measurements, kernel bridges must be set up to forward data between the bridges appropriately.

3.2 Compiling the Application

1. Go to the example directory:

export RTE_SDK=/path/to/rte_sdk
cd ${RTE_SDK}/examples/exception_path

2. Set the target (a default target will be used if not specified). For example:

export RTE_TARGET=x86_64-native-linuxapp-gcc

This application is intended as a linuxapp only. See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

make

3.3 Running the Application

The application requires a number of command line options:

./build/exception_path [EAL options] -- -p PORTMASK -i IN_CORES -o OUT_CORES

where:
• -p PORTMASK: A hex bitmask of ports to use
• -i IN_CORES: A hex bitmask of cores which read from the NIC
• -o OUT_CORES: A hex bitmask of cores which write to the NIC

Refer to the Intel DPDK Getting Started Guide for general information on runni…
    rx_queue->idle_hint = power_idle_heuristic(rx_queue->zero_rx_packet_count);
    lcore_rx_idle_count++;
} else {
    rx_ring_length = rte_eth_rx_queue_count(portid, queueid);
    rx_queue->zero_rx_packet_count = 0;

    /*
     * do not scale up frequency immediately as
     * user to kernel space communication is costly
     * which might impact packet I/O for received
     * packets
     */
    rx_queue->freq_up_hint = power_freq_scaleup_heuristic(lcore_id, rx_ring_length);
}

/* Prefetch and forward packets */
…

if (likely(lcore_rx_idle_count != qconf->n_rx_queue)) {
    for (i = 1, lcore_scaleup_hint = qconf->rx_queue_list[0].freq_up_hint;
            i < qconf->n_rx_queue; ++i) {
        rx_queue = &(qconf->rx_queue_list[i]);
        if (rx_queue->freq_up_hint > lcore_scaleup_hint)
            lcore_scaleup_hint = rx_queue->freq_up_hint;
    }

    if (lcore_scaleup_hint == FREQ_HIGHEST)
        rte_power_freq_max(lcore_id);
    else if (lcore_scaleup_hint == FREQ_HIGHER)
        rte_power_freq_up(lcore_id);
} else {
    /*
     * All Rx queues empty in recent consecutive polls,
     * sleep in a conservative manner, meaning sleep as
     * less as possible.
     */
    for (i = 1, lcore_idle_hint = qconf->rx_queue_list[0].idle_hint;
            i < qconf->n_rx_queue; ++i) {
        rx_queue = &(qconf->rx_queue_list[i]);
        if (rx_queue->…
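The scale-up decision above maps RX ring occupancy to a frequency hint: the fuller the NIC RX ring, the more aggressively the core should speed up. A sketch of that idea (the thresholds and the name `scaleup_hint` are assumptions; the sample's actual power_freq_scaleup_heuristic() may differ):

```c
#include <assert.h>

enum freq_hint { FREQ_CURRENT = 0, FREQ_HIGHER, FREQ_HIGHEST };

/* More packets waiting in the NIC RX ring means the core is falling
 * behind, so request a stronger frequency scale-up. */
static enum freq_hint
scaleup_hint(unsigned rx_ring_len, unsigned ring_size)
{
    if (rx_ring_len > ring_size / 2)
        return FREQ_HIGHEST;  /* ring more than half full: go to max */
    if (rx_ring_len > ring_size / 8)
        return FREQ_HIGHER;   /* moderate backlog: step up */
    return FREQ_CURRENT;      /* keeping up: stay put */
}
```

The per-queue hints are then reduced to a single per-lcore decision by taking the maximum, exactly as the loop over rx_queue_list does above.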
…handle the situation when different fragments of the same packet arrive through different RX queues. Each table entry can hold information about packets consisting of up to RTE_LIBRTE_IP_FRAG_MAX_FRAGS fragments.

    frag_cycles = (rte_get_tsc_hz() + MS_PER_S - 1) / MS_PER_S * max_flow_ttl;

    if ((qconf->frag_tbl[queue] = rte_ip_frag_tbl_create(max_flow_num,
            IPV4_FRAG_TBL_BUCKET_ENTRIES, max_flow_num, frag_cycles,
            socket)) == NULL) {
        RTE_LOG(ERR, IP_RSMBL, "ip_frag_tbl_create(%u) on "
            "lcore: %u for queue: %u failed\n",
            max_flow_num, lcore, queue);
        return -1;
    }

Mempools Initialization

The reassembly application demands a lot of mbufs to be allocated. At any given time, up to (2 * max_flow_num * RTE_LIBRTE_IP_FRAG_MAX_FRAGS * <maximum number of mbufs per packet>) mbufs can be stored inside the Fragment Table waiting for remaining fragments. To keep mempool size under reasonable limits, and to avoid the situation when one RX queue can starve other queues, each RX queue uses its own mempool.

    nb_mbuf = RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) * RTE_LIBRTE_IP_FRAG_MAX_FRAGS;
    nb_mbuf *= (port_conf.rxmode.max_rx_pkt_len + BUF_SIZE - 1) / BUF_SIZE;
    nb_mbuf *= 2; /* ipv4 and ipv6 */
    nb_mbuf += RTE_TEST_RX_DESC_DEFAULT + RTE_TEST_TX_DESC_DEFAULT;
    nb_mbuf = RTE_MAX(nb_mbuf, (uint32_t)NB_MBUF);
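The sizing arithmetic above can be checked in isolation. The following standalone sketch mirrors the formula with hypothetical stand-in constants (the burst size, per-packet fragment count, descriptor counts, and minimum pool size below are assumptions for illustration, not values from any particular DPDK build):

```c
#include <stdint.h>

/* Assumed stand-in constants mirroring the text's formula */
#define MAX_PKT_BURST     32
#define IP_FRAG_MAX_FRAGS  4
#define RX_DESC_DEFAULT  128
#define TX_DESC_DEFAULT  512
#define MIN_NB_MBUF     8192   /* stand-in for NB_MBUF */

static uint32_t max_u32(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* Per-queue mbuf budget: fragments held in the table, times buffers per
 * fragment, doubled for ipv4+ipv6, plus descriptor rings, floored at a
 * minimum pool size. */
uint32_t calc_nb_mbuf(uint32_t max_flow_num, uint32_t max_rx_pkt_len,
        uint32_t buf_size)
{
    uint32_t nb = max_u32(max_flow_num, 2 * MAX_PKT_BURST) * IP_FRAG_MAX_FRAGS;

    nb *= (max_rx_pkt_len + buf_size - 1) / buf_size; /* bufs per fragment */
    nb *= 2;                                          /* ipv4 and ipv6 */
    nb += RX_DESC_DEFAULT + TX_DESC_DEFAULT;
    return max_u32(nb, MIN_NB_MBUF);
}
```

For example, with 1024 flows and standard 1518-byte frames in 2 KB buffers, the per-fragment term collapses to 1 buffer and the descriptor rings dominate the tail of the sum.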
…is checked to see if it has a multicast destination address and if the routing table has any ports assigned to the destination address:

    if (!IS_IPV4_MCAST(dest_addr) ||
        (hash = rte_fbk_hash_lookup(mcast_hash, dest_addr)) <= 0 ||
        (port_mask = hash & enabled_port_mask) == 0) {
        rte_pktmbuf_free(m);
        return;
    }

Then, the number of ports in the destination portmask is calculated with the help of the bitcnt() function:

    /* Get number of bits set. */
    static inline uint32_t
    bitcnt(uint32_t v)
    {
        uint32_t n;

        for (n = 0; v != 0; v &= v - 1)
            n++;
        return n;
    }

This is done to determine which forwarding algorithm to use. This is explained in more detail in the next section.

Thereafter, a destination Ethernet address is constructed:

    /* construct destination ethernet address */
    dst_eth_addr = ETHER_ADDR_FOR_IPV4_MCAST(dest_addr);

Since Ethernet addresses are also part of the multicast process, each outgoing packet carries the same destination Ethernet address. The destination Ethernet address is constructed from the lower 23 bits of the multicast group ORed with the Ethernet address 01:00:5e:00:00:00, as per RFC 1112:

    #define ETHER_ADDR_FOR_IPV4_MCAST(x) \
        (rte_cpu_to_be_64(0x01005e000000ULL | ((x) & 0x7fffff)) >> 16)
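The RFC 1112 mapping and the bit-count helper can be exercised outside the DPDK with a small standalone sketch. Unlike the macro above, this version returns the 48-bit MAC in host order and skips the rte_cpu_to_be_64 byte-swap, which only matters for in-memory layout:

```c
#include <stdint.h>

/* Lower 23 bits of the IPv4 group address ORed into 01:00:5e:00:00:00 */
static uint64_t ether_addr_for_ipv4_mcast(uint32_t group)
{
    return 0x01005e000000ULL | (group & 0x7fffff);
}

/* Count set bits by clearing the lowest set bit each iteration */
static uint32_t bitcnt(uint32_t v)
{
    uint32_t n;

    for (n = 0; v != 0; v &= v - 1)
        n++;
    return n;
}
```

For instance, the group 224.1.1.1 (0xE0010101) maps to the MAC 01:00:5e:01:01:01; any two groups whose addresses differ only in the top 5 variable bits collide on the same MAC.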
22.1 Background ............................................................ 139
22.2 Sample Code Overview .................................................. 140
22.3 Supported Distributions ............................................... 142
22.4 Prerequisites ......................................................... 142
     22.4.1 Installing Packages on the Host ................................ 142
     22.4.2 Setting up the Execution Environment ........................... 143
     22.4.3 Setting up the Guest Execution Environment ..................... 144
22.5 Compiling the Sample Code ............................................. 144
22.6 Running the Sample Code ............................................... 145
     22.6.1 Parameters ..................................................... 146
22.7 Running the Virtual Machine (QEMU) .................................... 147
     22.7.1 Redirecting QEMU to vhost-net Sample Code ...................... 148
     22.7.2 Mapping the Virtual Machine's Memory ........................... 148
     22.7.3 QEMU …
23 …
24 …
25 …

Figures
Figure 1 … Figure 11 …
…interface is not available when using this compatibility layer.

Porting Netmap Applications

Porting Netmap applications typically involves two major steps:
• Changing the system calls to use their compat_netmap library counterparts
• Adding further Intel DPDK initialization code

Since the compat_netmap functions have the same signature as the usual libc calls, the change is in most cases trivial.

The usual Intel DPDK initialization code involving rte_eal_init() and rte_eal_pci_probe() has to be added to the Netmap application in the same way it is used in all other Intel DPDK sample applications. Please refer to the Intel DPDK Programmer's Guide (Rel 1.4 EAR) and example source code for details about initialization.

In addition to the regular Intel DPDK initialization code, the ported application needs to call initialization functions for the compat_netmap library, namely rte_netmap_init() and rte_netmap_init_port(). These two initialization functions take compat_netmap specific data structures as parameters: struct rte_netmap_conf and struct rte_netmap_port_conf. Those structures' fields are Netmap related and are self-explanatory for developers familiar with Netmap. They are defined in ${RTE_SDK}/examples/netmap_compat/lib/compat_netmap.h.

The bridge application is an example largely based on the bridge example shipped with the Netmap distribution. It shows how a minimal Netmap application with minimal and straightforward source…
…interrupt feature enabled.

Configuring lsc to 0 (the default) disables the generation of any link status change interrupts in kernel space and no user space interrupt event is received. The public interface rte_eth_link_get() accesses the NIC registers directly to update the link status. Configuring lsc to non-zero enables the generation of link status change interrupts in kernel space when a link status change is present, and calls the user space callbacks registered by the application. The public interface rte_eth_link_get() just reads the link status in a global structure that would be updated in the interrupt host thread only.

14.4.4 Interrupt Callback Registration

The application can register one or more callbacks to a specific port and interrupt event. An example callback function that has been written as indicated below:

    static void
    lsi_event_callback(uint8_t port_id, enum rte_eth_event_type type, void *param)
    {
        struct rte_eth_link link;

        RTE_SET_USED(param);

        printf("\n\nIn registered callback...\n");
        printf("Event type: %s\n", type == RTE_ETH_EVENT_INTR_LSC ?
                "LSC interrupt" : "unknown event");
        rte_eth_link_get_nowait(port_id, &link);
        if (link.link_status) {
            printf("Port %d Link Up - speed %u Mbps\n\n",
                    port_id, (unsigned)link.link_speed);
…the port statistics periodically. If an event indicating that a slave process exited is detected, it sends the STOP command to the peer and waits until it has also exited. Then, it tries to clean up the execution environment and prepare new resources. Finally, the new slave instance is launched.

    while (1) {
        sleep(1);
        cur_tsc = rte_rdtsc();
        diff_tsc = cur_tsc - prev_tsc;

        /* if timer is enabled */
        if (timer_period > 0) {
            /* advance the timer */
            timer_tsc += diff_tsc;

            /* if timer has reached its timeout */
            if (unlikely(timer_tsc >= (uint64_t)timer_period)) {
                print_stats();
                /* reset the timer */
                timer_tsc = 0;
            }
        }
        prev_tsc = cur_tsc;

        /* Check if any slave needs to restart or recreate */
        rte_spinlock_lock(&res_lock);
        for (i = 0; i < RTE_MAX_LCORE; i++) {
            struct lcore_resource_struct *res = &lcore_resource[i];
            struct lcore_resource_struct *pair = &lcore_resource[res->pair_id];

            /* If find slave exited, try to reset pair */
            if (res->enabled && res->flags && pair->enabled) {
                if (!pair->flags) {
                    master_sendcmd_with_ack(pair->lcore_id, CMD_STOP);
                    rte_spinlock_unlock(&res_lock);
                    sleep(1);
                    rte_spinlock_lock(&res_lock);
                    if (pair->flags)
                        continue;
                }
                if (reset_pair(res->lcore_id, pair->lcore_id) …
…the simple_ipv4_fwd_4pkts() function.

10.4.4 Packet Forwarding for LPM-based Lookups

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward() function, but the packet forwarding decision (that is, the identification of the output interface for the packet) for LPM-based lookups is done by the get_ipv4_dst_port() function below:

    static inline uint8_t
    get_ipv4_dst_port(struct ipv4_hdr *ipv4_hdr, uint8_t portid,
            lookup_struct_t *ipv4_l3fwd_lookup_struct)
    {
        uint8_t next_hop;

        return (uint8_t)((rte_lpm_lookup(ipv4_l3fwd_lookup_struct,
                rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop) == 0) ?
                next_hop : portid);
    }

11 L3 Forwarding with Power Management Sample Application

11.1 Introduction

The L3 Forwarding with Power Management application is an example of power-aware packet processing using the Intel DPDK. The application is based on the existing L3 Forwarding sample application, with the power management algorithms to control the P-states and C-states of the Intel processor via a power management library.

11.2 Overview

The application demonstrates the use of the Power libraries in the Intel DPDK to implement packet forwarding. The initialization and run-time paths are very similar to those of the L3 Forwarding sample application.
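Setting the rte_lpm internals aside, the longest-prefix-match decision itself can be sketched with a naive linear scan. The route table, depths, and next hops below are made-up examples for illustration; the real application uses the rte_lpm library, which is far faster than this:

```c
#include <stdint.h>

struct route {
    uint32_t prefix;    /* network address, host byte order */
    uint8_t  depth;     /* prefix length in bits, 0..32 */
    uint8_t  next_hop;  /* output port */
};

/* Return the next hop of the deepest matching route, or default_port
 * when no route matches — mirroring the fall-back to portid above. */
uint8_t lpm_lookup(const struct route *tbl, int n, uint32_t addr,
        uint8_t default_port)
{
    int best = -1, i;

    for (i = 0; i < n; i++) {
        uint32_t mask = tbl[i].depth ? ~0u << (32 - tbl[i].depth) : 0;

        if ((addr & mask) == (tbl[i].prefix & mask) &&
                (best < 0 || tbl[i].depth > tbl[best].depth))
            best = i;
    }
    return best >= 0 ? tbl[best].next_hop : default_port;
}
```

With routes 10.0.0.0/8 → port 1 and 10.1.0.0/16 → port 2, the address 10.1.2.3 matches both, and the deeper /16 wins.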
…static variables. For multiple-thread cases, all global and static variables have only one copy and they can be accessed by any thread if applicable. So, they can be used to sync or share data among threads.

In the previous examples, each process has separate global and static variables in memory and is independent of the others. If it is necessary to share the knowledge, some communication mechanism should be deployed, such as memzone, ring, shared memory, and so on. The global or static variables are not a valid approach to share data among processes. For variables in this example, on the one hand, the slave process inherits all the knowledge of these variables after being created by the master. On the other hand, other processes cannot know if one or more processes modifies them after slave creation since that is the nature of a multiple-process address space. But this does not mean that these variables cannot be used to share or sync data; it depends on the use case. The following are the possible use cases:

1. The master process starts and initializes a variable and it will never be changed after slave processes are created. This case is OK.
2. After the slave processes are created, the master or slave cores need to change a variable, but other processes do not need to know the change. This case is also OK.
3. After the slave processes are created, the master or a slave needs to change a variable. In the meantime, one or more other processes need…
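The point about separate per-process copies can be demonstrated with plain fork() semantics, independent of the DPDK. The helper below is illustrative only; the sample application itself spawns slaves differently:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_counter = 1; /* each process gets its own copy after fork() */

/* The child overwrites its private copy and exits; the parent's copy is
 * unaffected, which is why globals cannot carry data between processes. */
int parent_view_after_child_write(void)
{
    pid_t pid = fork();

    if (pid == 0) {        /* child process */
        g_counter = 42;    /* modifies the child's copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0); /* parent waits for the child to finish */
    return g_counter;      /* still 1 in the parent */
}
```

This is exactly use case 3 above failing: the write happens, but the peer never observes it, so a shared mechanism (memzone, ring, rte_malloc'd heap buffer) is needed instead.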
…SR-IOV features in a virtualized environment.

Note: Please note that previously a separate L2 Forwarding in Virtualized Environments sample application was used; however, in later Intel DPDK versions these sample applications have been merged.

Overview

The L2 Forwarding sample application, which can operate in real and virtualized environments, performs L2 forwarding for each packet that is received on an RX_PORT. The destination port is the adjacent port from the enabled portmask, that is, if the first four ports are enabled (portmask 0xf), ports 1 and 2 forward into each other, and ports 3 and 4 forward into each other. Also, the MAC addresses are affected as follows:
• The source MAC address is replaced by the TX_PORT MAC address
• The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID

This application can be used to benchmark performance using a traffic generator, as shown in Figure 3. The application can also be used in a virtualized environment as shown in Figure 4. The L2 Forwarding application can also be used as a starting point for developing a new application based on the Intel DPDK.

Figure 3. Performance Benchmark Setup (Basic Environment)

Figure 4. Performance Benchmark Setup (Virtualized Environment)
…to be aware of the change. In this case, global and static variables cannot be used to share knowledge. Another communication mechanism is needed. A simple approach without lock protection can be a heap buffer allocated by rte_malloc or memzone.

16.1.5.2 Slave Process Recovery Mechanism

Before talking about the recovery mechanism, it is necessary to know what is needed before a new slave instance can run if a previous one exited.

When a slave process exits, the system returns all the resources allocated for this process automatically. However, this does not include the resources that were allocated by the Intel DPDK. All the hardware resources are shared among the processes, which include memzone, mempool, ring, a heap buffer allocated by the rte_malloc library, and so on. If the new instance runs and the allocated resource is not returned, either resource allocation fails or the hardware resource is lost forever.

When a slave process runs, it may have dependencies on other processes. They could have execution sequence orders; they could share the ring to communicate; they could share the same port for reception and forwarding; they could use lock structures to do exclusive access in some critical path. What happens to the dependent process(es) if the peer leaves? The consequence…
…for details. In addition, the TAP interfaces must also be created. A TAP interface is created for each lcore that is being used. The code for creating the TAP interface is as follows:

    /*
     * Create a tap network interface, or use existing one with same name.
     * If name[0]='\0' then a name is automatically assigned and returned in name.
     */
    static int tap_create(char *name)
    {
        struct ifreq ifr;
        int fd, ret;

        fd = open("/dev/net/tun", O_RDWR);
        if (fd < 0)
            return fd;

        memset(&ifr, 0, sizeof(ifr));

        /* TAP device without packet information */
        ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
        if (name && *name)
            rte_snprintf(ifr.ifr_name, IFNAMSIZ, name);

        ret = ioctl(fd, TUNSETIFF, (void *) &ifr);
        if (ret < 0) {
            close(fd);
            return ret;
        }

        if (name)
            rte_snprintf(name, IFNAMSIZ, ifr.ifr_name);

        return fd;
    }

The other step in the initialization process that is unique to this sample application is the association of each port with two cores:
• One core to read from the port and write to a TAP interface
• A second core to read from a TAP interface and write to the port

This is done using an array called port_ids[], which is indexed by the lcore IDs. The population of this array is shown below:

    tx_port = 0;
    rx_port = 0;

    RTE_LCORE_FOREACH(i) …
8.6.3 Callbacks for Kernel Requests

To execute specific PMD operations in user space requested by some Linux commands, callbacks must be implemented and filled in the struct rte_kni_ops structure. Currently, setting a new MTU and configuring the network interface (up/down) are supported.

    static struct rte_kni_ops kni_ops = {
        .change_mtu = kni_change_mtu,
        .config_network_if = kni_config_network_interface,
    };

    /* Callback for request of changing MTU */
    static int
    kni_change_mtu(uint8_t port_id, unsigned new_mtu)
    {
        int ret;
        struct rte_eth_conf conf;

        if (port_id >= rte_eth_dev_count()) {
            RTE_LOG(ERR, APP, "Invalid port id %d\n", port_id);
            return -EINVAL;
        }

        RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);

        /* Stop specific port */
        rte_eth_dev_stop(port_id);

        memcpy(&conf, &port_conf, sizeof(conf));

        /* Set new MTU */
        if (new_mtu > ETHER_MAX_LEN)
            conf.rxmode.jumbo_frame = 1;
        else
            conf.rxmode.jumbo_frame = 0;

        /* mtu + length of header + length of FCS = max pkt length */
        conf.rxmode.max_rx_pkt_len = new_mtu + KNI_ENET_HEADER_SIZE +
                KNI_ENET_FCS_SIZE;

        ret = rte_eth_dev_configure(port_id, 1, 1, &conf);
        if (ret < 0) {
            RTE_LOG(ERR, APP, "Fail to reconfigure …
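The max_rx_pkt_len computation in kni_change_mtu() is simple enough to check standalone. The constants below mirror typical Ethernet framing (14-byte header, 4-byte FCS, 1518-byte maximum untagged frame) and are assumptions, not necessarily the sample's exact KNI_ENET_* values:

```c
#include <stdint.h>

#define ENET_HEADER_SIZE 14   /* assumed Ethernet header length */
#define ENET_FCS_SIZE     4   /* assumed frame check sequence length */
#define ETHER_MAX_LEN  1518   /* standard maximum untagged frame length */

/* mtu + length of header + length of FCS = max pkt length */
uint32_t max_rx_pkt_len_for_mtu(uint32_t new_mtu)
{
    return new_mtu + ENET_HEADER_SIZE + ENET_FCS_SIZE;
}

/* Jumbo frames are needed once the MTU exceeds the standard frame size,
 * matching the new_mtu > ETHER_MAX_LEN test in the callback above */
int needs_jumbo_frame(uint32_t new_mtu)
{
    return new_mtu > ETHER_MAX_LEN;
}
```

For the standard 1500-byte MTU this yields a 1518-byte maximum packet length, which is exactly the classic Ethernet frame size.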
…hugepagesz=2M hugepages=256 default_hugepagesz=1G

3. Update the grub bootloader:

       grub2-mkconfig -o /boot/grub2/grub.cfg

4. Reboot the system.

Note: Ensure that the default hugepage size after this setup is 1 GB.

23.4.3 Setting up the Guest Execution Environment

It is recommended for testing purposes that the Intel DPDK testpmd sample application is used in the guest to forward packets; the reasons for this are discussed in Section 23.7, Running the Virtual Machine (QEMU) on page 147.

The testpmd application forwards packets between pairs of Ethernet devices; it requires an even number of Ethernet devices (virtio or otherwise) to execute. It is therefore recommended to create multiples of two virtio-net devices for each Virtual Machine, either through libvirt or at the command line as follows:

Note: Observe that in the example, -device and -netdev are repeated for two virtio-net devices.

    user@target:~$ qemu-system-x86_64 ... \
        -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd> \
        -device virtio-net-pci,netdev=hostnet1,id=net1 \
        -netdev tap,id=hostnet2,vhost=on,vhostfd=<open fd> \
        -device virtio-net-pci,netdev=hostnet2,id=net2

23.5 Compiling the Sample Code

1. Go to the examples directory:

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/vhost_net

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.
…assumed to be saved in the same file. The application parses the rules from the file and adds them to the database by calling the ACL library function. It ignores empty and comment lines, and parses and validates the rules it reads. If errors are detected, the application exits with messages to identify the errors encountered.

The application needs to consider the userdata and priority fields. The ACL rules save the index to the specific rules in the userdata field, while route rules save the forwarding port number. In order to differentiate the two types of rules, ACL rules add a signature in the userdata field. As for the priority field, the application assumes rules are organized in descending order of priority. Therefore, the code only decreases the priority number with each rule it parses.

12.4.2 Setting Up the ACL Context

For each supported AC rule format (IPv4 5-tuple, IPv6 6-tuple), the application creates a separate context handler from the ACL library for each CPU socket on the board, and adds parsed rules into that context. Note that for each supported rule type, the application needs to calculate the expected offset of the fields from the start of the packet. That's why only packets with fixed IPv4/IPv6 header are supported. That allows to per…
…IPv6 packets.

For example, consider a dual processor socket platform where cores 0-7 and 16-23 appear on socket 0, while cores 8-15 and 24-31 appear on socket 1. Let's say that the programmer wants to use memory from both NUMA nodes, the platform has only two ports (one connected to each NUMA node), and the programmer wants to use two cores from each processor socket to do the packet processing.

To enable L3 forwarding between two ports, using two cores (cores 1 and 2) from each processor, while also taking advantage of local memory access by optimizing around NUMA, the programmer must enable two queues from each port, pin to the appropriate cores and allocate memory from the appropriate NUMA node. This is achieved using the following command:

    ./build/l3fwd -c 606 -n 4 -- -p 0x3 --config="(0,0,1),(0,1,2),(1,0,9),(1,1,10)"

In this command:
• The -c option enables cores 1, 2, 9 and 10 (coremask 0x606)
• The -p option enables ports 0 and 1
• The --config option enables two queues on each port and maps each (port, queue) pair to a specific core

Logic to enable multiple RX queues using RSS and to allocate memory from the correct NUMA nodes is included in the application and is done transparently. The following table shows the mapping in this example:
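The -c 606 mask can be derived mechanically from the desired core list. A tiny helper (illustrative only, not part of the sample) shows the arithmetic:

```c
#include <stdint.h>

/* Build an EAL-style coremask with one bit set per listed lcore ID */
uint64_t coremask(const int *cores, int n)
{
    uint64_t mask = 0;
    int i;

    for (i = 0; i < n; i++)
        mask |= 1ULL << cores[i];
    return mask;
}
```

Cores 1, 2, 9 and 10 set bits 1, 2, 9 and 10, giving 0x606, which is the hexadecimal value passed to -c.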
…validated with the following distributions:
• Fedora 18
• Fedora 19

23.4 Prerequisites

This section lists prerequisite packages that must be installed.

23.4.1 Installing Packages on the Host

The vhost sample code uses the following packages: fuse, fuse-devel, and kernel-modules-extra.

1. Install Fuse Development Libraries and headers:

       yum -y install fuse fuse-devel

2. Install the Cuse Kernel Module:

       yum -y install kernel-modules-extra

23.4.2 Setting up the Execution Environment

The vhost sample code requires that QEMU allocates a VM's memory on the hugetlbfs file system. As the vhost sample code requires hugepages, the best practice is to partition the system into separate hugepage mount points for the VMs and the vhost sample code.

Note: This is best practice only and is not mandatory. For systems that only support 2 MB page sizes, both QEMU and vhost sample code can use the same hugetlbfs mount point without issue.

QEMU

VMs with gigabytes of memory can benefit from having QEMU allocate their memory from 1 GB huge pages. 1 GB huge pages must be allocated at boot time by passing kernel parameters through the grub boot loader.

1. Calculate the maximum memory usage of all VMs to be run on the system. Then, round this value up to the nearest Gigabyte the e…
…have the same value for the load balancing field, this scheme also ensures that all the packets that are part of the same traffic flow are directed to the same worker lcore (flow affinity) in the same order they enter the system (packet ordering).

15.1.2 I/O TX Logical Cores

Each I/O lcore owns the packet TX for a predefined set of NIC ports. To enable each worker thread to send packets to any NIC TX port, the application creates a software ring for each (worker lcore, NIC TX port) pair, with each I/O TX core handling those software rings that are associated with NIC ports that it handles.

15.1.3 Worker Logical Cores

Each worker lcore reads packets from its set of input software rings and routes them to the NIC ports for transmission by dispatching them to output software rings. The routing logic is LPM based, with all the worker threads sharing the same LPM rules.

15.2 Compiling the Application

The sequence of steps used to build the application is:

1. Export the required environment variables:

       export RTE_SDK=<Path to the Intel DPDK installation folder>
       export RTE_TARGET=x86_64-native-linuxapp-gcc

2. Build the application executable file:

       cd ${RTE_SDK}/examples/load_balancer
       make

For more details on how to build the Intel DPDK libraries and sample applications, please refer to the Intel DPDK Getting Started Guide.
…write burst size to the NIC TX (the default value is 64)
• --msz M: Mempool size (in number of mbufs) for each pfc (default 2097152)
• --rth "A, B, C": The RX queue threshold parameters
  A = RX prefetch threshold (the default value is 8)
  B = RX host threshold (the default value is 8)
  C = RX write-back threshold (the default value is 4)
• --tth "A, B, C": TX queue threshold parameters
  A = TX prefetch threshold (the default value is 36)
  B = TX host threshold (the default value is 0)
  C = TX write-back threshold (the default value is 0)
• --cfg FILE: Profile configuration to load

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

The profile configuration file defines all the port/subport/pipe/traffic class/queue parameters needed for the QoS scheduler configuration. The profile file has the following format:

    ; port configuration [port]
    frame overhead = 24
    number of subports per port = 1
    number of pipes per subport = 4096
    queue sizes = 64 64 64 64

    ; Subport configuration
    [subport 0]
    tb rate = 1250000000           ; Bytes per second
    tb size = 1000000              ; Bytes
    tc 0 rate = 1250000000         ; Bytes per second
    tc 1 rate = 1250000000         ; Bytes per second
    tc 2 rate = 1250000000         ; Bytes per second
    tc 3 rate = 1250000000         ; Bytes per second
    tc period = 10                 ; Milliseconds
    tc oversubscription period = 10  ; Milliseconds

    pipe 0-4095 = 0                ; These pipes …
    for (i = 0; i < nb_rx; i++) {
        struct rte_mbuf *m = pkts_burst[i];
        int ret = write(tap_fd, rte_pktmbuf_mtod(m, void*),
                rte_pktmbuf_data_len(m));
        rte_pktmbuf_free(m);
        if (unlikely(ret < 0))
            lcore_stats[lcore_id].dropped++;
        else
            lcore_stats[lcore_id].tx++;
    }

For the other case that reads from a TAP interface and writes to a NIC port, packets are retrieved by doing a read() from the file descriptor of the appropriate TAP interface. This fills in the data into the mbuf, then other fields are set manually. The packet can then be transmitted as normal.

    /* Loop forever reading from tap and writing to NIC */
    for (;;) {
        int ret;
        struct rte_mbuf *m = rte_pktmbuf_alloc(pktmbuf_pool);
        if (m == NULL)
            continue;

        ret = read(tap_fd, m->pkt.data, MAX_PACKET_SZ);
        lcore_stats[lcore_id].rx++;
        if (unlikely(ret < 0)) {
            FATAL_ERROR("Reading from %s interface failed", tap_name);
        }
        m->pkt.nb_segs = 1;
        m->pkt.next = NULL;
        m->pkt.pkt_len = (uint16_t)ret;
        m->pkt.data_len = (uint16_t)ret;
        ret = rte_eth_tx_burst(port_ids[lcore_id], 0, &m, 1);
        if (unlikely(ret < 1)) {
            rte_pktmbuf_free(m);
            lcore_stats[lcore_id].dropped++;
        }
        else {
            lcore_stats[lcore_id].tx++;
        }
    }

To set up loops for measuring throughput, TAP interfaces can be connected using bridging. The ste…
…lcore_rx, lcore_tx, lcore_kthread …)"

where:
• -P: Set all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.
• -p PORTMASK: Hexadecimal bitmask of ports to configure.
• --config="(port, lcore_rx, lcore_tx[, lcore_kthread, ...])[,(port, lcore_rx, lcore_tx[, lcore_kthread, ...])]": Determines which lcores of RX/TX/kernel thread are mapped to which ports.

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

The -c coremask parameter of the EAL options should include the lcores indicated by the lcore_rx and lcore_tx, but does not need to include lcores indicated by lcore_kthread, as they are used to pin the kernel thread on.

The -p PORTMASK parameter should include the ports indicated by the port in --config, neither more nor less.

The lcore_kthread in --config can be configured with none, one or more lcore IDs. In multiple kernel thread mode, if configured with none, a KNI device will be allocated for each port, while no specific lcore affinity will be set for its kernel th…
…Example Data Flow in a Client-Server Symmetric Multi-process Application
[Figure: packets arriving on the NIC hardware queues are read by the server process and distributed to the client processes (Client 0, Client 1, ...), which return them for transmission.]

16.1.4.1 Running the Application

The server process must be run initially as the primary process to set up all memory structures for use by the clients. In addition to the EAL parameters, the application-specific parameters are:
• -p <portmask>, where portmask is a hexadecimal bitmask of what ports on the system are to be used. For example: -p 3 to use ports 0 and 1 only.
• -n <num-clients>, where the num-clients parameter is the number of client processes that will process the packets received by the server application.

Note: In the server process, a single thread (the master thread, that is, the lowest numbered lcore in the coremask) performs all packet I/O. If a coremask is specified with more than a single lcore bit set in it, an additional lcore will be used for a thread to periodically print packet count statistics.

Since the server application stores configuration data in shared memory, including the network ports to be used, the only application parameter needed by a client process is its client instance ID. Therefore, to run a server application on lcore 1 (with lcore 2 printing statistics), along with two client processes running on lcores…
…${RTE_SDK}/examples/link_status_interrupt

2. Set the target (a default target is used if not specified). For example:

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the Intel DPDK Getting Started Guide for possible RTE_TARGET values.

3. Build the application:

       make

Note: The compiled application is written to the build subdirectory. To have the application written to a different location, the O=/path/to/build/directory option may be specified on the make command line.

14.3 Running the Application

The application requires a number of command line options:

    ./build/link_status_interrupt [EAL options] -- -p PORTMASK [-q NQ] [-T PERIOD]

where:
• -p PORTMASK: A hexadecimal bitmask of the ports to configure
• -q NQ: A number of queues (=ports) per lcore (default is 1)
• -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default)

To run the application in a linuxapp environment with 4 lcores, 4 memory channels, 16 ports and 8 RX queues per lcore, issue the command:

    ./build/link_status_interrupt -c f -n 4 -- -q 8 -p ffff

Refer to the Intel DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

14.4 Explanation

The following sections provide some e…
…execution environment will require.

2. Edit the /etc/default/grub file, and add the following to the GRUB_CMDLINE_LINUX entry:

       GRUB_CMDLINE_LINUX="... hugepagesz=1G hugepages=<Number of hugepages required> default_hugepagesz=1G"

3. Update the grub boot loader:

       grub2-mkconfig -o /boot/grub2/grub.cfg

4. Reboot the system.
5. The hugetlbfs mount point (/dev/hugepages) should now default to allocating gigabyte pages.

Note: Making the above modification will change the system default hugepage size to 1 GB for all applications.

Vhost Sample Code

In this section, we create a second hugetlbfs mount point to allocate hugepages for the Intel DPDK vhost sample code.

1. Allocate sufficient 2 MB pages for the Intel DPDK vhost sample code:

       echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

2. Mount hugetlbfs at a separate mount point for 2 MB pages:

       mount -t hugetlbfs nodev /mnt/huge -o pagesize=2M

The above steps can be automated by doing the following:

1. Edit /etc/fstab to add an entry to automatically mount the second hugetlbfs mount point:

       hugetlbfs <tab> /mnt/huge <tab> hugetlbfs defaults,pagesize=1G 0 0

2. Edit the /etc/default/grub file, and add the following to the GRUB_CMDLINE_LINUX entry:

       GRUB_CMDLINE_LINUX="... h…
…explanation of the code.

14.4.1 Command Line Arguments

The Link Status Interrupt sample application takes specific parameters, in addition to Environment Abstraction Layer (EAL) arguments (see Section 14.3). Command line parsing is done in the same way as it is done in the L2 Forwarding Sample Application. See Section 9.4.1, Command Line Arguments on page 51 for more information.

14.4.2 Mbuf Pool Initialization

Mbuf pool initialization is done in the same way as it is done in the L2 Forwarding Sample Application. See Section 9.4.2, Mbuf Pool Initialization on page 51 for more information.

14.4.3 Driver Initialization

The main part of the code in the main() function relates to the initialization of the driver. To fully understand this code, it is recommended to study the chapters that relate to the Poll Mode Driver in the Intel DPDK Programmer's Guide and the Intel DPDK API Reference.

    if (rte_eal_pci_probe() < 0)
        rte_exit(EXIT_FAILURE, "Cannot probe PCI\n");

    nb_ports = rte_eth_dev_count();
    if (nb_ports == 0)
        rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
    if (nb_ports > RTE_MAX_ETHPORTS)
        nb_ports = RTE_MAX_ETHPORTS;

Each logical core is assigned a dedicated TX queue on each port.
…ized on that core. The pipeline stage on core X dequeues packets from core X-1's rings and enqueues them on its own rings. See Figure 15.

    /* Start pipeline_stage() on all the available slave lcores but the last */
    for (lcore_id = 0; lcore_id < last_lcore_id; lcore_id++) {
        if (rte_lcore_is_enabled(lcore_id) && lcore_id != master_lcore_id) {
            for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++)
                if (is_bit_set(port_id, portmask))
                    init_ring(lcore_id, port_id);

            rte_eal_remote_launch(pipeline_stage, NULL, lcore_id);
        }
    }

The last available logical core runs send_stage(), which is the last stage of the pipeline, dequeuing packets from the last ring in the pipeline and sending them out on the destination port set up by pair_ports().

    /* Start send_stage() on the last slave core */
    rte_eal_remote_launch(send_stage, NULL, last_lcore_id);

19.4.1.7 Receive, Process and Transmit Packets

Figure 15 shows where each thread in the pipeline is. It should be used as a reference while reading the rest of this section.

Figure 15. Threads and Pipelines
[Figure: pipeline_stage() threads run on logical cores 1 through N-1, and send_stage() runs on the last logical core; the receive stage runs on the master logical core.]

In the receive stage function running on the master l…
    snprintf(buf, sizeof(buf), "mbuf_pool_%u_%u", lcore, queue);

    if ((rxq->pool = rte_mempool_create(buf, nb_mbuf, MBUF_SIZE, 0,
            sizeof(struct rte_pktmbuf_pool_private),
            rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL,
            socket, MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) == NULL) {
        RTE_LOG(ERR, IP_RSMBL, "mempool_create(%s) failed", buf);
        return -1;
    }

7.4.4 Packet Reassembly and Forwarding

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward() function. If the packet is an IPv4 or IPv6 fragment, it calls rte_ipv4_reassemble_packet() for IPv4 packets, or rte_ipv6_reassemble_packet() for IPv6 packets. These functions either return a pointer to a valid mbuf that contains the reassembled packet, or NULL if the packet can't be reassembled for some reason. Then l3fwd_simple_forward() continues with the code for the packet forwarding decision (that is, the identification of the output interface for the packet) and the actual transmit of the packet.

The rte_ipv4_reassemble_packet() or rte_ipv6_reassemble_packet() functions are responsible for:

1. Searching the Fragment Table for an entry with the packet's <IP Source Address, IP Destination Address, Packet ID>.

2. If the entry is found, then check if that entry alrea…