NC364T NIC drops lots of packets when traffic hits a 250 kpps rate

I have a server with a 32-core CPU and an NC364T quad-port 1G Ethernet adapter. I have noticed that when the packet rate reaches roughly 200,000-250,000 packets per second I start seeing intermittent drops in the ifconfig output, and the ksoftirqd/0 process starts eating CPU because all of the receive work sticks to CPU0 instead of being spread across all CPUs.



[root@server1 ~]# ifconfig | grep drop
RX errors 0 dropped 3019497 overruns 0 frame 0
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
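
To see where the receive load is actually landing, the per-CPU interrupt and softirq counters can be watched with something like the following (a diagnostic sketch; mpstat comes from the sysstat package, and the output is omitted here):

    # Which CPU is servicing the NIC's interrupt (IRQ 86 on this box)
    grep -E 'CPU|enp9s0f0' /proc/interrupts

    # Per-CPU NET_RX softirq counters; the column that keeps growing is the busy CPU
    watch -d 'grep -E "CPU|NET_RX" /proc/softirqs'

    # Per-CPU softirq time (%soft) while the traffic is running
    mpstat -P ALL 1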


This is what I have tried so far to resolve the issue:




  1. Increased the RX ring buffer size to its maximum of 4096:



    ethtool -G enp9s0f0 rx 4096




  2. Tried to adjust the IRQ affinity, but still no luck (a sketch for verifying both of these changes is shown right after this list):



    echo ff > /proc/irq/86/smp_affinity
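
A quick way to double-check that both changes actually took effect is something like this (a verification sketch using the same interface and IRQ number; output omitted). As far as I understand, a single MSI vector is still serviced by one CPU at a time, so widening the affinity mask does not by itself spread the receive processing.

    # Confirm the ring size change (shows maximum vs. current settings)
    ethtool -g enp9s0f0

    # Confirm the affinity mask was accepted by the kernel
    cat /proc/irq/86/smp_affinity

    # irqbalance may rewrite the mask; check whether it is running while testing
    systemctl status irqbalance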



But I am still seeing drops. Here is some relevant output:



[root@server1 ~]# ethtool -S enp9s0f0 | grep rx
rx_packets: 21351639073
rx_bytes: 2150885430317
rx_broadcast: 3520806
rx_multicast: 1054
rx_errors: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 49483446
rx_missed_errors: 2997479
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
rx_csum_offload_good: 21089322623
rx_csum_offload_errors: 107538
rx_header_split: 0
alloc_rx_buff_failed: 0
rx_smbus: 0
rx_dma_failed: 0
rx_hwtstamp_cleared: 0
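
If I read these counters correctly, rx_no_buffer_count and rx_missed_errors mean the host is not draining the RX ring fast enough rather than there being a wire-level problem, so the interrupt coalescing settings are another thing worth inspecting (the value below is purely illustrative, and whether e1000e honours it may depend on the driver's InterruptThrottleRate mode):

    # Show the current coalescing settings
    ethtool -c enp9s0f0

    # Example: set the RX interrupt delay explicitly (illustrative value only)
    ethtool -C enp9s0f0 rx-usecs 50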


Detailed feature settings:



[root@server1 ~]# ethtool -k enp9s0f0
Features for enp9s0f0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
tx-sctp-segmentation: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]


NIC details:



[root@server1 ~]# lspci | grep -i eth
09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
09:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
0a:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
0a:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)


According to lspci -vv it does support MSI, but it does not have MSI-X:



09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
Subsystem: Hewlett-Packard Company NC364T PCI Express Quad Port Gigabit Server Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 86
NUMA node: 0
Region 0: Memory at f7de0000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at f7d00000 (32-bit, non-prefetchable) [size=512K]
Region 2: I/O ports at 6000 [size=32]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00c18 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset- SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #2, Speed 2.5GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 00-26-55-ff-ff-d2-1a-6c
Kernel driver in use: e1000e
Kernel modules: e1000e
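
Since the adapter exposes only a single MSI vector (no MSI-X, so no hardware multi-queue/RSS), I assume any spreading of the receive processing has to happen in software. A minimal Receive Packet Steering (RPS) sketch for the default rx-0 queue looks like this (the CPU mask and flow counts are illustrative; I have not verified that this cures the drops):

    # Allow CPUs 0-7 to run receive processing for this queue
    echo ff > /sys/class/net/enp9s0f0/queues/rx-0/rps_cpus

    # Optionally enable Receive Flow Steering on top of RPS
    echo 32768 > /proc/sys/net/core/rps_sock_flow_cnt
    echo 4096 > /sys/class/net/enp9s0f0/queues/rx-0/rps_flow_cnt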






  • Does that network have a lot of Windows machines and/or a lot of broadcast/multicast traffic? Are you using jumbo frames where you shouldn't? The NIC is only one part of the equation.
    – Rui F Ribeiro
    Jan 14 at 13:43











  • No, I am not using jumbo frames, and I have not applied any other custom settings. I am running CentOS 7, and most of the traffic is UDP streaming. If the packet rate stays below 200 kpps everything works fine, but when it goes higher I start noticing drops.
    – Satish
    Jan 14 at 13:45











  • Are you sure it is 200 kbps? That seems too low for this kind of problem.
    – Rui F Ribeiro
    Jan 14 at 13:49






  • I am saying 200,000 packets per second, i.e. the packet arrival rate, not the packet size. My traffic is about 200 Mbps; don't confuse kbps with kpps.
    – Satish
    Jan 14 at 13:54


