Among the different gesture types, static gestures, or postures, convey a broad range of communicative information such as commands or emblems. Vision-based posture recognition is among the most intuitive yet challenging tasks in intelligent systems. Advances in deep learning, specifically convolutional neural networks (CNNs), have replaced hand-crafted models and engineered features with automated image feature learning, at the expense of large data requirements and long training sessions for optimal parameter tuning. The aim of the present study is to explore the potential of sparse autoencoders for posture recognition, promoting an alternative to current convolutional approaches. We conduct experiments with hierarchically designed autoencoders that retain the desired image feature abstractions on two posture datasets with distinct characteristics. The different data properties allow us to demonstrate the influence of network parameters on performance. Our evaluation shows that even a shallow network design achieves superior performance compared to a multi-channel CNN, and comparable results on a small dataset with sparse image samples. From our study we conclude that such 'lightweight' approaches are viable tools for posture recognition and merit further exploration.
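To make the core idea concrete, the following is a minimal NumPy sketch of a sparse autoencoder of the kind referred to above: a single hidden layer trained with a reconstruction loss plus a KL-divergence sparsity penalty on the mean hidden activation. This is an illustrative toy, not the paper's architecture; the layer sizes, sparsity target `rho`, penalty weight `beta`, and learning rate are all hypothetical choices.

```python
# Minimal sparse autoencoder sketch (illustrative; hyperparameters are
# hypothetical, not those used in the study).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SparseAutoencoder:
    def __init__(self, n_in, n_hidden, rho=0.05, beta=3.0, lr=0.5):
        self.rho, self.beta, self.lr = rho, beta, lr
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)

    def step(self, X):
        m = len(X)
        # Forward pass
        H = sigmoid(X @ self.W1 + self.b1)      # hidden activations
        Y = sigmoid(H @ self.W2 + self.b2)      # reconstruction
        rho_hat = H.mean(axis=0)                # mean activation per unit
        # Backward pass: squared-error gradient plus the KL sparsity
        # penalty beta * KL(rho || rho_hat) on the hidden layer.
        dY = (Y - X) * Y * (1 - Y)
        sparse_grad = self.beta * (-self.rho / rho_hat
                                   + (1 - self.rho) / (1 - rho_hat))
        dH = (dY @ self.W2.T + sparse_grad) * H * (1 - H)
        self.W2 -= self.lr * H.T @ dY / m
        self.b2 -= self.lr * dY.mean(axis=0)
        self.W1 -= self.lr * X.T @ dH / m
        self.b1 -= self.lr * dH.mean(axis=0)
        return np.mean((Y - X) ** 2)            # reconstruction error

# Toy "images": 64 samples of 16 pixels each, values in [0, 1].
X = rng.random((64, 16))
ae = SparseAutoencoder(n_in=16, n_hidden=8)
losses = [ae.step(X) for _ in range(200)]
```

After training, the reconstruction error has dropped from its initial value while the sparsity penalty keeps most hidden units near-inactive; a hierarchical design as used in the study would stack such layers, feeding each layer's hidden code to the next.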