In this work, we address the problem of monocular tracking the human motion based on the discriminative sparse representation. The proposed method jointly trains the dictionary and the discriminative linear classifier to separate the human being from the background. We show that using the online dictionary learning, the tracking algorithm can adapt the variation of human appearance and background environment. We compared the proposed method with four state-of-the-art tracking algorithms on eight benchmark video clips (Faceocc, Sylv, David, Singer, Girl, Ballet, OneLeaveShopReenter2cor, and ThreePastShop2cor). Qualitative and quantitative experimental validation results are discussed at length. The proposed algorithm for human tracking achieves superior tracking results, and a Matlab run time on a standard desktop machine of four frames per second.