The coding mode space of high efficiency video coding (HEVC) is extremely large so it needs huge amount of computations for HEVC encoders to do mode decision (MD). Parallelizing HEVC encoding on many-core platforms is an efficient and promising approach to fulfill the high computational demands. Traditional coarse-grained parallelizing schemes such as Tiles and wavefront parallel processing (WPP) either cause too much quality loss or cant afford a high parallelism degree. In this paper, the potential parallelism in HEVC intra MD process is exploited, and a multi-level and fine-grained highly parallel intra MD method which works in a coding tree unit (CTU) is proposed. Specifically, the intra MD process in a CTU is divided into six types of sub-tasks, and the data dependencies among adjacent blocks that hinder parallel processing are analyzed and removed, including intra prediction dependency, prediction mode dependency and entropy coding dependency; consequently the MD computation for all fine-grained coding blocks of different levels within the same CTU can be computed concurrently. The proposed parallel MD method is implemented on Tile-Gx36 platform. Experimental results show that the proposed parallel MD method gets an overall speed up of more than 18x with acceptable quality loss (about 3% bit-rate increasing), compared with the non-parallel baseline HM.