“A better algorithm” is always going to be significantly faster; otherwise it's not a better algorithm.
When I got started in this business, computer cycles had to be rigorously conserved. And it wasn't just the CPU; it was every part of the hardware. Things were smaller and slower. Algorithms had to be smarter just to get the work done. And it was done. (Imagine a timesharing computer with a 1.5 MHz clock and 512K of memory, with 32 terminals attached, all being used to register a college of 5,000 students for classes ... with less than one-second response time to any request, even if every user hit the Enter key at precisely the same instant, as we actually confirmed.)
Now, we've got an embarrassment of riches. We can throw hundreds of pounds of silicon at any problem. A great big free sand-pile. But there are still choke-points, and if those choke-points are not clearly taken into account by the algorithm, we'll get poor performance and 0.01% CPU utilization. (The two often occur together.)
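As a minimal sketch of what "taking a choke-point into account" can look like (the task and function names here are my own illustration, not from the original): finding duplicates in a list can be written so that every membership test re-scans a growing list, or so that membership tests are hash lookups. Both versions do the same work; only one hammers the choke-point.

```python
def duplicates_naive(items):
    """O(n^2): every membership test scans a growing list."""
    seen, dupes = [], []
    for x in items:
        if x in seen:            # linear scan -- the choke-point
            if x not in dupes:   # another linear scan
                dupes.append(x)
        else:
            seen.append(x)
    return sorted(dupes)

def duplicates_fast(items):
    """O(n): hash-based membership tests sidestep the choke-point."""
    seen, dupes = set(), set()
    for x in items:
        if x in seen:            # average-case O(1) lookup
            dupes.add(x)
        seen.add(x)
    return sorted(dupes)
```

On a few dozen items the difference is invisible; on a few million, the naive version spends nearly all its time scanning `seen` while the rest of the machine sits idle.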