The combined input-crosspoint-queued (CICQ) switch fabric is well suited to multi-terabit switch implementation because its scheduling is distributed. Round-robin algorithms have been studied extensively because they are simple to implement in hardware. They are known to provide high throughput under uniform traffic, but their performance degrades under nonuniform traffic. This paper identifies the cause of that degradation in existing round-robin algorithms and proposes a class of dual round-robin algorithms. In the proposed algorithms, each input arbiter maintains two round-robin pointers, the primary pointer and the secondary pointer. The input queue indicated by the primary pointer has the highest scheduling priority, and the decision to update the primary pointer is made dynamically based on the input queue status. When the input queue indicated by the primary pointer is blocked, the other input queues are scheduled uniformly according to the position of the secondary pointer. Simulations show that the dual round-robin algorithms significantly improve the performance of the CICQ switch under nonuniform traffic.
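The two-pointer idea described above can be illustrated with a minimal Python sketch of a single input arbiter. It is not the authors' algorithm: the abstract does not state the exact update rules, so the class name `DualRoundRobinArbiter`, the rule that the primary pointer moves on only when its queue has drained, and the interpretation of "blocked" as "the corresponding crosspoint buffer is full" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DualRoundRobinArbiter:
    """Hypothetical input arbiter holding a primary and a secondary pointer."""

    num_outputs: int
    primary: int = 0    # queue with the highest scheduling priority
    secondary: int = 0  # round-robin start position for the other queues

    def select(self, voq_len: List[int], xp_full: List[bool]) -> Optional[int]:
        """Return the index of the input queue to serve this slot, or None.

        voq_len[j] is the occupancy of the input queue for output j;
        xp_full[j] is True when that queue is blocked (assumed here to mean
        its crosspoint buffer is full).
        """
        n = self.num_outputs

        def eligible(j: int) -> bool:
            return voq_len[j] > 0 and not xp_full[j]

        # Assumed update rule: once the primary queue has drained, move the
        # primary pointer to the next non-empty queue in round-robin order.
        if voq_len[self.primary] == 0:
            for k in range(1, n):
                j = (self.primary + k) % n
                if voq_len[j] > 0:
                    self.primary = j
                    break

        # The primary queue is served whenever it is not blocked.
        if eligible(self.primary):
            return self.primary

        # Otherwise serve the remaining queues round-robin from the
        # secondary pointer, advancing it past the queue that is served.
        for k in range(n):
            j = (self.secondary + k) % n
            if j != self.primary and eligible(j):
                self.secondary = (j + 1) % n
                return j
        return None
```

In this sketch the primary pointer stays on a backlogged queue across time slots, while the secondary pointer spreads service over the remaining queues only when the primary queue cannot be served. Other update policies for the primary pointer (for example, thresholds on queue length) would fit the same structure.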