Serving large data sets at acceptable latency typically requires an amount of RAM roughly proportional to the data size, to keep frequently accessed files and indexes in memory (especially if you're serving off HDDs).
It depends on the service, but we frequently find geo services are RAM-limited up to a certain point (at which point there's enough cached to make them CPU-limited).
Valhalla in particular is an in-memory router, so it's holding the whole transportation graph in memory. Not sure about the tiling and geocoding pieces. Most organizations with a "need to serve the world" requirement presumably don't balk at buying all-the-RAMs. Amazingly, the max-RAM-in-a-machine number has grown faster than the data-in-the-world number, and just being memory-bound is a perfectly reasonable design decision, given the performance and complexity wins.
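To make the in-memory-router idea concrete, here's a minimal sketch (not Valhalla's actual code or data model) of the pattern: the whole graph lives in RAM as an adjacency list, so each route query is pure memory traversal with no per-request disk I/O. The toy graph and edge weights are made up for illustration.

```python
import heapq

# Hypothetical toy graph: node -> [(neighbor, edge_cost), ...].
# A real router holds millions of nodes like this, entirely in RAM.
graph = {
    "a": [("b", 4.0), ("c", 2.0)],
    "b": [("d", 5.0)],
    "c": [("b", 1.0), ("d", 8.0)],
    "d": [],
}

def shortest_path_cost(graph, src, dst):
    """Dijkstra's algorithm over the in-memory adjacency list."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    seen = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in seen:
            continue
        seen.add(node)
        if node == dst:
            return d
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")

print(shortest_path_cost(graph, "a", "d"))  # a->c->b->d = 2+1+5 = 8.0
```

The trade-off the comment describes is exactly this: you pay the RAM to hold the whole structure, and in exchange every query runs at memory speed with no cache-miss-to-disk tail latencies.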