This solution tries to hw-offload flows in the flow classification cache. On success, the NIC does the classification and tags each packet with a flow id hint, so that the packet can be sent directly to its output target port and the flow cache statistics can be updated, either from counters retrieved from the NIC or from counters handled in software. If the hw-offload fails, normal OVS SW handling applies. The flow cache contains match and action structures that are used to generate a flow classification offload request to the device, using the flow specification and the wildcard bitmask from the match structure (megaflow). This is done on add, modify and delete of flow cache elements in the dpif-netdev userspace module. The actual flow offload request to the device is made through two new netdev_class functions, "hw_flow_offload" and "get_flow_stats". In netdev-dpdk, these functions are implemented using the DPDK RTE FLOW API.
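
The "try hw-offload, otherwise fall back to SW" decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy_* types, the callback signatures and the toy NIC that accepts exact-match flows only are all hypothetical and are not taken from the patch, which uses the real OVS match and action structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the two callbacks named in the text.
 * The real signatures in the patch are not shown here and will differ. */
struct toy_netdev_class {
    /* Returns 0 and fills *flow_id on success, non-zero if the NIC refuses. */
    int (*hw_flow_offload)(uint32_t key, uint32_t mask, uint32_t *flow_id);
    /* Returns 0 and fills *packets if the NIC keeps per-flow counters. */
    int (*get_flow_stats)(uint32_t flow_id, uint64_t *packets);
};

enum toy_flow_path { TOY_PATH_SW, TOY_PATH_HW };

struct toy_flow {
    uint32_t key;          /* flow specification (toy: one 32-bit field) */
    uint32_t mask;         /* set bits must match; all ones = exact match */
    uint32_t hw_flow_id;   /* valid only when path == TOY_PATH_HW */
    enum toy_flow_path path;
};

/* Toy NIC (assumption): accepts exact-match flows only. */
static int toy_nic_offload(uint32_t key, uint32_t mask, uint32_t *flow_id)
{
    if (mask != UINT32_MAX) {
        return -1;
    }
    *flow_id = key & 0xffffu;   /* arbitrary id chosen by the toy NIC */
    return 0;
}

static const struct toy_netdev_class toy_offload_class = { toy_nic_offload, NULL };
static const struct toy_netdev_class toy_plain_class = { NULL, NULL };

/* Try the hardware first; any failure leaves the flow on the SW path. */
static enum toy_flow_path
toy_flow_install(const struct toy_netdev_class *cls, struct toy_flow *flow)
{
    assert(cls != NULL && flow != NULL);

    flow->path = TOY_PATH_SW;
    if (cls->hw_flow_offload
        && cls->hw_flow_offload(flow->key, flow->mask, &flow->hw_flow_id) == 0) {
        flow->path = TOY_PATH_HW;
    }
    return flow->path;
}
```

A device class without the callback, or a NIC that rejects the request, simply leaves the flow in software, which mirrors the fallback behaviour the text describes.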

How to enable hw offload:

Use OVSDB to add hw-port-id to the interface options:

    # ovs-vsctl set interface dpdk0 options:hw-port-id=0

Implementation details (hopefully not too messy; the diff can be retrieved from GitHub: https://github.com/napatech/ovs/commit/467f076835143a9d0c17ea514d4e5c0c33d72c98.diff):

1) The OVSDB option hw-port-id is used to enable hw-offload and maps the port interface to the hw port id. If a dpdkX port is configured where X does not match the hw-port-id, an in-port remap is requested using RTE FLOW ITEM PORT. This allows for virtual ports in the NIC, which is needed to do south, east and west flow classification in the NIC. Note: this is not supported on all NICs that support RTE FLOW.

dpif-netdev.c
=============

    static odp_port_t hw_local_port_map[MAX_HW_PORT_CNT];

This array keeps a mapping between every port created in dpif-netdev that has an associated hw-port-id and the odp port id selected by dpif. It is used to map the port id used in flows to the hw-port-id used in the NIC.

In do_add_port():

    if (netdev_is_pmd(port->netdev)) {
        struct smap netdev_args;

        smap_init(&netdev_args);
        netdev_get_config(port->netdev, &netdev_args);
        port->hw_port_id = smap_get_int(&netdev_args, "hw-port-id",
                                        (int)(MAX_HW_PORT_CNT + 1));
        smap_destroy(&netdev_args);

        VLOG_DBG("ADD PORT (%s) has hw-port-id %i, odp_port_id %i\n",
                 devname, port->hw_port_id, port_no);

        /* Strictly less than: valid indexes are 0..MAX_HW_PORT_CNT-1. */
        if (port->hw_port_id < MAX_HW_PORT_CNT) {
            hw_local_port_map[port->hw_port_id] = port_no;

            /* Inform back to netdev driver the actual selected ODP port number */
            smap_init(&netdev_args);
            smap_add_format(&netdev_args, "odp_port_no", "%lu",
                            (long unsigned int)port_no);
            netdev_set_config(port->netdev, &netdev_args, NULL);
            smap_destroy(&netdev_args);
        }
    } else {
        port->hw_port_id = (int)(MAX_HW_PORT_CNT + 1);
    }

In do_add_port, netdev_get_config() is called to retrieve the hw-port-id, if one is specified. When it is specified, the odp-port-id is then sent back to the netdev port.
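
The hw-port-id to odp port table above can be pictured with a small bounds-checked map. This is a minimal sketch, not the patch code: the toy_* names, the table size of 8 and the "no mapping" sentinel are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_HW_PORT_CNT 8          /* assumed size; the real value is not shown */
#define TOY_ODP_PORT_NONE UINT32_MAX   /* assumed sentinel for "no mapping" */

/* Indexed by hw-port-id, holds the odp port number chosen by the datapath. */
static uint32_t toy_port_map[TOY_MAX_HW_PORT_CNT];

static void toy_port_map_init(void)
{
    for (int i = 0; i < TOY_MAX_HW_PORT_CNT; i++) {
        toy_port_map[i] = TOY_ODP_PORT_NONE;
    }
}

/* Records the mapping; rejects ids outside 0..TOY_MAX_HW_PORT_CNT-1. */
static bool toy_port_map_add(int hw_port_id, uint32_t odp_port_no)
{
    if (hw_port_id < 0 || hw_port_id >= TOY_MAX_HW_PORT_CNT) {
        return false;
    }
    assert(odp_port_no != TOY_ODP_PORT_NONE);
    toy_port_map[hw_port_id] = odp_port_no;
    return true;
}

/* Returns the odp port for a hw-port-id, or TOY_ODP_PORT_NONE. */
static uint32_t toy_port_map_lookup(int hw_port_id)
{
    if (hw_port_id < 0 || hw_port_id >= TOY_MAX_HW_PORT_CNT) {
        return TOY_ODP_PORT_NONE;
    }
    return toy_port_map[hw_port_id];
}
```

Checking both ends of the range before indexing matters here because the hw-port-id comes straight from user configuration in OVSDB.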

    static void
    dp_netdev_try_hw_offload(struct dp_netdev_port *port, struct match *match,
                             const ovs_u128 *ufid, const struct nlattr *actions,
                             size_t actions_len)
    {
        if (port->hw_port_id < MAX_HW_PORT_CNT) {
            /* Log the names directly; no copy is needed for logging. */
            VLOG_INFO("FLOW ADD (try match) -> Actions in_port %i (port name : %s, type : %s)\n",
                      match->flow.in_port.ofp_port,
                      netdev_get_name(port->netdev), port->type);
            netdev_try_hw_flow_offload(port->netdev, port->hw_port_id, match,
                                       ufid, actions, actions_len);
        }
    }

This function is called to add a new flow to hw or to modify an existing one. The netdev_hw_offload module holds a cache of the ufids of all flows that have been offloaded successfully to hw.

In dp_netdev_pmd_remove_flow:

    netdev_try_hw_flow_offload(port->netdev, -1, NULL, &flow->ufid, NULL, 0);

is called to remove a hw-offloaded flow. The netdev_hw_offload module deletes the flow in hw if the ufid is found in its cache.

In dp_netdev_process_rxq_port:

    if (port->hw_port_id < MAX_HW_PORT_CNT) {
        struct netdev_flow_stats *flow_stats = NULL;
        int i;
        time_t a = time_now();

        if (port->sec < a) {
            if (!ovs_mutex_trylock(&stat_mutex)) {
                port->sec = a;
                netdev_hw_get_stats_from_dev(rx, &flow_stats);
                if (flow_stats) {
                    struct dp_netdev_flow *flow;
                    long long now = time_msec();

                    for (i = 0; i < flow_stats->num; i++) {
                        if (flow_stats->flow_stat[i].packets) {
                            /* Some statistics to update on from this flow */
                            VLOG_DBG("Update stats with pkts %lu, bytes %lu, errors %lu, flow-id %lu\n",
                                     flow_stats->flow_stat[i].packets,
                                     flow_stats->flow_stat[i].bytes,
                                     flow_stats->flow_stat[i].errors,
                                     (long unsigned int)flow_stats->flow_stat[i].flow_id);

                            flow = dp_netdev_pmd_find_flow(pmd, &flow_stats->flow_stat[i].ufid,
                                                           NULL, 0);
                            if (flow) {
                                dp_netdev_flow_used(flow, flow_stats->flow_stat[i].packets,
                                                    flow_stats->flow_stat[i].bytes, 0, now);
                            }
                        }
                    }
                    netdev_hw_free_stats_from_dev(rx, &flow_stats);
                }
                ovs_mutex_unlock(&stat_mutex);
            }
        }
    }

This section is added to poll for hw statistics each second and then update the flow cache statistics. netdev_hw_get_stats_from_dev is a function in the netdev_hw_offload module that retrieves statistics from hw or, if that is not supported, reads them from the flow cache. NOTE: or'ed tcp_flags are not supported yet.

    #define GET_ODP_OUT_PORT(id) \
        (((id) & FLOW_ID_ODP_PORT_BIT) \
         ? ((id) & FLOW_ID_PORT_MASK) \
         : (((id) & FLOW_ID_PORT_MASK) < FLOW_ID_PORT_MASK) \
           ? hw_local_port_map[(id) & HW_PORT_MASK] \
           : OVS_BE32_MAX)

    if (port->hw_port_id < MAX_HW_PORT_CNT) {
        int i, ii;
        struct dp_packet_batch direct_batch;
        odp_port_t direct_odp_port = OVS_BE32_MAX;

        dp_packet_batch_init(&direct_batch);

        i = 0;
        while (i < batch.count) {
            uint32_t flow_id = dp_packet_get_pre_classified_flow_id(batch.packets[i]);

            if (flow_id != (uint32_t)-1) {
                odp_port_t odp_out_port = GET_ODP_OUT_PORT(flow_id);

                if (odp_out_port < OVS_BE32_MAX) {
                    if (direct_batch.count && direct_odp_port != odp_out_port) {
                        /* Check if SW flow statistics update in hw-offload is needed -
                         * only if hw cannot give flow stats */
                        netdev_update_flow_stats(rx, &direct_batch);
                        _send_pre_classified_batch(pmd, direct_odp_port, &direct_batch);
                        direct_batch.count = 0;
                    }
                    direct_odp_port = odp_out_port;
                    direct_batch.packets[direct_batch.count++] = batch.packets[i];

                    /* Close the gap left by the packet that was moved out. */
                    for (ii = i + 1; ii < batch.count; ii++) {
                        batch.packets[ii - 1] = batch.packets[ii];
                    }
                    batch.count--;
                    continue;
                }
            }
            i++;
        }

        if (direct_batch.count) {
            VLOG_DBG("Tx directly from Port (odp) %i to %i, num %i, left %i\n",
                     port_no, direct_odp_port, direct_batch.count, batch.count);

            /* Check if SW flow statistics update in hw-offload is needed -
             * only if hw cannot give flow stats */
            netdev_update_flow_stats(rx, &direct_batch);
            _send_pre_classified_batch(pmd, direct_odp_port, &direct_batch);
        }
        if (!batch.count) return;
    }

This section is added to check for a pre-classification id from the NIC. If a packet is pre-classified, it is sent directly to its destination. netdev_update_flow_stats is called to count statistics from the batch if the NIC does not support flow statistics directly.
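
To illustrate the idea of separating pre-classified packets from the rest of a receive batch, here is a simplified sketch. It is not the patch code: the toy_* names are hypothetical, it uses a single-pass compaction instead of shifting the tail of the batch on every hit, and it leaves out the per-output-port flushing and the GET_ODP_OUT_PORT decoding that the quoted code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same "(uint32_t)-1 means no hint" convention as in the text. */
#define TOY_NO_FLOW_ID UINT32_MAX

/* Hypothetical packet: only the NIC hint and a tag to tell packets apart. */
struct toy_packet {
    uint32_t flow_id;
    int payload;
};

/* Moves every hinted packet into direct[] and compacts the remaining
 * packets at the front of batch[], preserving order in both arrays.
 * direct[] must have room for *count entries.
 * Returns the number of packets placed in direct[]; *count is updated
 * to the number of packets left for normal SW classification. */
static size_t
toy_split_batch(struct toy_packet **batch, size_t *count,
                struct toy_packet **direct)
{
    size_t kept = 0;
    size_t n_direct = 0;

    assert(batch != NULL && count != NULL && direct != NULL);

    for (size_t i = 0; i < *count; i++) {
        if (batch[i]->flow_id != TOY_NO_FLOW_ID) {
            direct[n_direct++] = batch[i];
        } else {
            batch[kept++] = batch[i];   /* kept <= i, so this never clobbers */
        }
    }
    *count = kept;
    return n_direct;
}
```

With a write index that trails the read index, each packet is moved at most once, so the cost stays linear in the batch size even when most packets carry a hint.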

netdev-dpdk.c
=============

match.h needs to be included to be able to read the flow in the hw_flow_offload function:

    #include "openvswitch/match.h"

Add FDIR_MODE_PERFECT mode for the Intel Flow Director configuration:

    .fdir_conf = {
        .mode = RTE_FDIR_MODE_PERFECT,
    },

The struct netdev_dpdk structure adds a few fields to keep track of hw-offload:

    /* added to handle hw offloading of
     * packet classification. */
    bool hw_offload;
    int hw_port_id;
    uint16_t odp_port_no;

If hw-port-id is configured in OVSDB, that id is reported to dpif-netdev add_port through netdev_dpdk_get_config, and odp_port_no is set back through netdev_dpdk_set_config.

In netdev_dpdk_set_config:

    int hw_port_id = smap_get_int(args, "hw-port-id", -1);
    uint32_t odp_port_no = smap_get_int(args, "odp_port_no", -1);

    ovs_mutex_lock(&dpdk_mutex);
    ovs_mutex_lock(&dev->mutex);

    if (hw_port_id != -1)
        dev->hw_port_id = hw_port_id;

    if ((odp_port_no < (uint32_t)-1)) {
        dev->odp_port_no = (uint16_t)odp_port_no;
        if (dev->hw_port_id != -1) {
            dev->hw_offload = true;
            VLOG_INFO("HW OFFLOAD ready on device (%s) : hw-port-id <-> odp-port-id %i <-> %i\n",
                      netdev->name, dev->hw_port_id, dev->odp_port_no);
        }
        ovs_mute