This is caused by caching of the site map. Currently we clear the cache when a page is created or deleted by touching cache dependency files under /Data/Sites/[SiteID]/systemfiles. The other nodes do not update right away because their copies of the dependency file were not touched. The cache on those nodes will clear the next time the app pool recycles, or possibly once the sync happens, since the sync may touch the file.
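To make the mechanism concrete, here is a rough sketch of a file-dependency cache with two nodes. It is written in Python purely for illustration and is not our actual code: the class, the helper functions and the file names are all made up for the example, and I'm assuming each node keeps its own copy of the dependency file.

```python
import os
import tempfile


class FileDependencyCache:
    """Hypothetical cache that reloads its value when a dependency file's mtime changes."""

    def __init__(self, dependency_path, loader):
        self.dependency_path = dependency_path
        self.loader = loader
        self._value = None
        self._stamp = None  # mtime seen when the value was last loaded

    def get(self):
        stamp = os.stat(self.dependency_path).st_mtime_ns
        if stamp != self._stamp:
            # Dependency file was touched (or this is the first read): reload.
            self._value = self.loader()
            self._stamp = stamp
        return self._value


def touch(path):
    """Bump the file's mtime by two seconds so the change is visible on coarse filesystems."""
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 2_000_000_000))


def demo():
    pages = ["Home"]  # stands in for the shared page data
    with tempfile.TemporaryDirectory() as root:
        # Assumption: each node has its own local copy of the dependency file.
        dep_a = os.path.join(root, "node_a_sitemap.dep")
        dep_b = os.path.join(root, "node_b_sitemap.dep")
        for path in (dep_a, dep_b):
            open(path, "w").close()

        node_a = FileDependencyCache(dep_a, lambda: list(pages))
        node_b = FileDependencyCache(dep_b, lambda: list(pages))
        node_a.get()
        node_b.get()  # both nodes now hold a cached site map

        pages.append("News")  # a page is created on node A...
        touch(dep_a)          # ...and only node A's file is touched

        seen_by_a = node_a.get()
        seen_by_b = node_b.get()  # still the old site map

        touch(dep_b)  # e.g. the sync eventually touches node B's copy
        seen_by_b_after_sync = node_b.get()
    return seen_by_a, seen_by_b, seen_by_b_after_sync
```

Calling demo() shows node A picking up the new page immediately, while node B keeps serving the old site map until its own dependency file is touched.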
I think a simple failover server arrangement should work fine, but we don't yet officially support web farms and will probably need to revisit our caching strategy for that scenario. If you frequently switch the active node back and forth you will run into this kind of problem, whereas if failover only happens once in a while it should work fine.
I can say that I'll be focusing on web farm support in the next few months. I've got something else I need to finish first, but web farm support is next on my list.