The goal of autonomous driving is to provide a safe, comfortable, and efficient driving experience. Widespread deployment of autonomous driving systems, however, requires processing sensory data from multiple streams in a timely and accurate fashion. The challenges that arise are thus two-fold: leveraging the multiple sensors available on autonomous vehicles to boost perception accuracy, and jointly optimizing the perception models and the underlying computing models to meet real-time requirements. To address these challenges, this paper surveys the latest research on sensing and edge computing for autonomous driving and presents our own autonomous driving system, Sonic. Specifically, we propose ImageFusion, a multi-modality perception model that combines lidar and camera data for 3D object detection, and MPInfer, a computational optimization framework that meets the real-time requirements of autonomous driving.