Most studies about deep learning are built on neural networks, i.e., multiple layers of parameterized differentiable nonlinear modules trained by backpropagation. Recently, deep forest was proposed as a non-NN style deep model, which has much fewer parameters than deep neural networks. It shows robust performance under different hyperparameter settings and across different tasks, and the model complexity can be determined in a data-dependent style. Represented by gcForest, the study of deep forest provides a promising way of building deep models based on non-differentiable modules. However, deep forest is now used offline which inhibits its application in many real tasks, e.g., in the context of learning from data streams. In this work, we explore the possibility of building deep forest under the incremental setting and propose Mondrian deep forest. It has a cascade forest structure to do layer-by-layer processing. And we further enhance its layer-by-layer processing by devising an adaptive mechanism, which is capable of adjusting the attention to the original features versus the transformed features of the previous layer, therefore notably mitigating the deficiency of Mondrian forest in handling irrelevant features. Empirical results show that, while inheriting the incremental learning ability of Mondrian forest, Mondrian deep forest has a significant improvement in performance. And using the same default setting of hyperparameters, Mondrian deep forest is able to achieve satisfying performance across different datasets. In the incremental training setting, Mondrian deep forest achieves highly competitive predictive performance with periodically retrained gcForest while being an order of magnitude faster.