ECN_PFC_DCQCN_RoCE

 


 

5 points you can understand why using ECN and PFC Together to Build Lossless Ethernet Networks in AI infrastructure:

 1- Both ECN and PFC by themselves can manage congestion very well. By working together, they can be even more effective.


2- ECN can react first to mitigate congestion. If ECN does not react fast enough and buffer utilization continues to build up, PFC behaves as a fail-safe and prevents traffic drops. This is the most efficient way to manage congestion and build lossless Ethernet networks.


3-This collaborative process between PFC and ECN where they managed congestion together is called Data Center Quantized Congestion Notification (DCQCN) and is developed for RoCE networks.


4- By working together, PFC and ECN provide efficient end-to-end congestion management. When the system is experiencing minor congestion where buffer usage is moderate, WRED with ECN manages the congestion seamlessly.


5- For both WRED and ECN to work as described, you should set appropriate thresholds. In the following example, the WRED minimum and maximum threshold are set for lower buffer utilization to mitigate congestion first, and the PFC threshold is set higher as a safety net to mitigate congestion after ECN. Both ECN and PFC work on a queue that is no-drop, providing lossless transport.


 The below example both Host A and B send traffic to Host X. As leaf uplinks provide enough bandwidth for all traffic to arrive to Leaf X, the congestion point is on the outgoing interface toward Host X.





Comments

Popular posts from this blog

gNMI_with_grafana on containerlabs

EVPN Route type-1 & type-4 in action

Network Automation with ROBOT Framework