The CICQ (combined input-crosspoint queued) switch architecture requires no internal speed-up and schedules inputs and outputs in parallel; combined with the RR (round-robin) algorithm, it offers a unique advantage for designing high-performance switches. However, this combination cannot achieve 100% throughput under non-uniform traffic. The performance of the RR algorithm under non-uniform traffic is determined by two critical factors: the buffer capacity of each crosspoint and the service loss. Based on a theoretical study of these factors, a high-throughput scheduling algorithm with small crosspoint buffers is presented. Simulations demonstrate that the new algorithm can achieve 100% throughput under arbitrary traffic using a buffer of only one cell at each crosspoint. The new algorithm retains the simplicity and efficiency of RR-RR, with O(1) complexity, while overcoming the instability problem of RR-RR.
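
For orientation, the baseline RR-RR scheme on a CICQ switch can be sketched in a few lines of Python. This is a minimal illustration of plain round-robin arbitration at the inputs and outputs, not the new algorithm presented in this paper. The class name `CICQSwitch`, its methods, the `xp_capacity` parameter, and the choice to run the input phase before the output phase within one time slot are all assumptions made for the example.

```python
from collections import deque


class CICQSwitch:
    """Toy N x N CICQ switch with RR input and RR output arbitration (illustrative)."""

    def __init__(self, n, xp_capacity=1):
        self.n = n
        self.xp_capacity = xp_capacity  # cells per crosspoint buffer (assumed parameter)
        # voq[i][j]: virtual output queue at input i holding cells for output j
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]
        # xp[i][j]: crosspoint buffer between input i and output j
        self.xp = [[deque() for _ in range(n)] for _ in range(n)]
        self.in_ptr = [0] * n   # round-robin pointer of each input arbiter
        self.out_ptr = [0] * n  # round-robin pointer of each output arbiter

    def enqueue(self, i, j, cell):
        """A cell arrives at input i destined for output j."""
        self.voq[i][j].append(cell)

    def step(self):
        """Run one time slot and return the departures as (output, cell) pairs."""
        n = self.n
        # Input phase: each input independently picks, in round-robin order,
        # the first non-empty VOQ whose crosspoint buffer still has room.
        for i in range(n):
            for k in range(n):
                j = (self.in_ptr[i] + k) % n
                if self.voq[i][j] and len(self.xp[i][j]) < self.xp_capacity:
                    self.xp[i][j].append(self.voq[i][j].popleft())
                    self.in_ptr[i] = (j + 1) % n  # advance past the served VOQ
                    break
        # Output phase: each output independently picks, in round-robin order,
        # the first non-empty crosspoint buffer in its column.
        departures = []
        for j in range(n):
            for k in range(n):
                i = (self.out_ptr[j] + k) % n
                if self.xp[i][j]:
                    departures.append((j, self.xp[i][j].popleft()))
                    self.out_ptr[j] = (i + 1) % n  # advance past the served input
                    break
        return departures


if __name__ == "__main__":
    sw = CICQSwitch(2)
    sw.enqueue(0, 0, "a")
    sw.enqueue(1, 0, "b")
    sw.enqueue(0, 1, "c")
    for slot in range(3):
        print(slot, sw.step())
```

The software loops scan up to N candidates per port, so the sketch is meant to show the pointer-update rule and the role of the crosspoint buffer limit rather than the O(1) hardware arbitration. An input is blocked from serving a VOQ whenever the corresponding crosspoint buffer is full, which is where buffer capacity enters the picture.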