Multiple vehicle tracking in aerial video has vital applications in intelligent transportation systems. However, the task is challenging owing to the demand for high tracking efficiency, and handling shadows and occlusions in complex traffic scenes is particularly difficult. Most current approaches address multiple vehicle tracking on the basis of consecutive frame pairs, which makes them inefficient and ineffective at tracking vehicles that are shadowed or occluded by trees, buildings, or large vehicles. We therefore propose an efficient deshadowing approach for multiple vehicle tracking in aerial video that operates on consecutive multiframe images via image segmentation and local region matching. By processing a series of frames at a time, the approach addresses the shadows and occlusions that arise in multiple vehicle tracking. First, image segmentation and road segmentation are combined to construct a region adjacency graph within the road (RAGR) for each frame. Local region matching is then performed among the RAGRs based on the distance transform. In particular, to improve region matching efficiency, a local matching criterion is proposed that introduces a region of interest for each node of the RAGR. Finally, based on the motion dissimilarities and varying spatial distances of the background and moving vehicles, a batch-oriented foreground separation is performed to separate the vehicles (the foreground) from the background. Experiments and comparative analysis conducted on the UAV123 and DARPA VIVID data sets demonstrate that the proposed approach generates satisfactory and competitive results.
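
As a rough illustration of the RAGR construction step, the following sketch builds a region adjacency graph restricted to road pixels from a per-frame segmentation label map and a binary road mask. It is not the authors' implementation: the function name `build_ragr`, the use of 4-connectivity, and the representation of the graph as a dictionary of neighbor sets are all assumptions made for the example.

```python
import numpy as np

def build_ragr(labels, road_mask):
    """Hypothetical helper: region adjacency graph within the road.

    labels    : 2-D integer array, one segment id per pixel (assumed input).
    road_mask : 2-D boolean array, True where the pixel lies on the road.
    Returns a dict mapping each segment id found on the road to the set of
    segment ids it touches on the road (4-connectivity, an assumption).
    """
    labels = np.asarray(labels)
    road = np.asarray(road_mask, dtype=bool)

    # One node per segment that has at least one pixel inside the road.
    graph = {int(l): set() for l in np.unique(labels[road])}

    # Compare each pixel with its right neighbor, then with its lower neighbor.
    pairs = (
        (labels[:, :-1], labels[:, 1:], road[:, :-1] & road[:, 1:]),
        (labels[:-1, :], labels[1:, :], road[:-1, :] & road[1:, :]),
    )
    for a, b, both_on_road in pairs:
        # Keep only boundaries between two different segments on the road.
        keep = both_on_road & (a != b)
        for u, v in zip(a[keep].tolist(), b[keep].tolist()):
            graph[u].add(v)
            graph[v].add(u)
    return graph
```

Restricting the nodes and edges to the road mask keeps the graph small, which is in the spirit of the efficiency goal stated above; the subsequent matching and foreground separation steps are not covered by this sketch.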