Parallel systems based on multi-core or many-core processors have more complicated structure and more tasks running on it than traditional uni-core based systems. To allocate tasks in this kind of systems efficiently, a task allocation model based on the task interaction graph is built, in which the processes, threads and communications among them are taken into account. And then an iteration-based heuristic algorithm is proposed based on the model. The algorithm is composed of two rounds of operations, in which the processes are assigned to processing nodes in the first round and the threads of a process are assigned to processing cores in the second round respectively. Each round of operation partitions the task interaction graph by iterations with backtracking. As a result, the final partitioned graph corresponds to the task allocation solution which assigns tasks to processing cores. The heuristic algorithm is evaluated by comparing it with exhaustive searching and genetic algorithms using 1400 random-generated task interaction graphs, and the results show that the proposed heuristic algorithm can find near-optimal solutions in reasonable time, and behave better in scalability than genetic algorithms when the number of threads increases, since it can find solutions in much less time than genetic algorithms.