A heap is a specialized datastructure that can be thought of as an "automatically sorted array" given a somewhat scaled down definition of "sorted". Sorted in this case means that the largest element is always easy to find and remove, and inserting new elements is also easy.
Given that, our steps to find the smallest M elements would be:
- Build a heap from our first M items
- Compare next item to largest element in heap
- Replace largest with new if new < largest
- Repeat steps 2 and 3 for all remaining items
As you can see, this is very similar to the strategy you proposed. In fact, you could view heaps as a datastructure designed specifically to implement this algorithm efficiently.