Machine learning is mostly non-Bayesian. There is a significant Bayesian contingent in the machine learning literature, but it is decidedly a minority, since non-Bayesian procedures accomplish much the same thing through regularization, at dramatically lower computing cost. You can argue that regularization is Bayesian in spirit if not in letter, but it is entirely comprehensible and defensible from a frequentist perspective; indeed, this connection is a large part of why most Bayesian procedures have reasonably good frequentist properties. The typical machine learning practitioner uses Bayes when it is convenient and other methods when it isn't.
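To make the "Bayesian in spirit" point concrete, here is a minimal sketch (toy data invented for illustration, using scikit-learn's Ridge) of the standard fact that ridge regression coincides with the MAP estimate under a Gaussian prior on the coefficients:

```python
# Sketch: ridge regression == MAP estimation with a Gaussian prior (toy data, illustrative only).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, p, alpha = 200, 5, 3.0          # alpha plays the role of penalty strength / prior precision
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# Frequentist view: penalized least squares, argmin ||y - Xb||^2 + alpha * ||b||^2
ridge = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

# Bayesian view: y ~ N(Xb, sigma^2 I), b ~ N(0, (sigma^2 / alpha) I)
# => the posterior mode (MAP) has the closed form (X'X + alpha I)^{-1} X'y
map_estimate = np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

print(np.allclose(ridge.coef_, map_estimate))  # True: the two estimates coincide
```

The same penalized fit can be read either as a frequentist shrinkage estimator or as a posterior mode, which is the sense in which regularization is "Bayesian in spirit".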
Bayes appeals to a lot of people who are relatively new to statistics because of its conceptual simplicity, intuitiveness, and uniformity of method. It really is a very elegant way to think about statistics. Frequentism is usually not presented that way, so people fail to realize it is not just a hodgepodge of tools. (It is important to distinguish classical statistics, which really is a hodgepodge of tools, from frequentist statistics, which, like Bayes, is guided by a single, elegant overarching principle: evaluate procedures by how they behave under repeated sampling.)
There is a philosophical discussion to be had about the interpretation of probability, but these days most everyone agrees that probability is useful for representing both variation under replication and epistemic uncertainty. These are different things, to be sure, and theorems showing practical harmony between frequentist and Bayesian approaches (e.g., complete class theorems) don't change this fundamental philosophical gulf.
The real difference in applied settings is one of goals and methods. To do Bayes, you use the Bayesian method (in brief, prior + likelihood => posterior). Anything that comes out of this method is solid Bayes. To do frequentism, you use any method that has good frequentist properties, such as calibrated coverage or controlled error rates. There is no completely general-purpose recipe for securing good frequentist properties, but neither is there a completely general guarantee that Bayes will deliver them.
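As a concrete instance of the "prior + likelihood => posterior" recipe, here is a minimal conjugate example with made-up coin-flip data (the Beta(1, 1) prior and the counts are assumptions chosen only to fix ideas): a Beta prior combined with a binomial likelihood gives a Beta posterior in closed form.

```python
# Sketch of the Bayesian recipe: prior + likelihood => posterior (Beta-Binomial conjugacy, toy numbers).
from scipy.stats import beta

a_prior, b_prior = 1.0, 1.0        # Beta(1, 1) prior: uniform over the success probability
heads, tails = 7, 3                # hypothetical data: 7 successes in 10 trials

# Conjugate update: Beta(a, b) prior + binomial likelihood with k successes and m failures
# => Beta(a + k, b + m) posterior
a_post, b_post = a_prior + heads, b_prior + tails
posterior = beta(a_post, b_post)

print("posterior mean:", posterior.mean())                 # ~0.667
print("95% credible interval:", posterior.interval(0.95))  # equal-tailed interval
```

Whatever comes out of this update is, by definition, the Bayesian answer; whether it also has good frequentist properties is a separate question, taken up below.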
Strict Bayesians, who insist on using the Bayesian approach for every problem, are settling for easy answers over good answers. For example, in a given problem it may be extremely difficult to produce a confidence interval with 95% coverage. The Bayesian can easily produce a credible interval with 95% posterior probability, however, using the one method they always use. They pretend that this is an advantage of Bayes, but note that they've changed the subject. If their interval had 95% coverage, it would be just fine by frequentist lights; but it doesn't, which is why frequentists don't have an answer. Bayesians will often respond that what scientists really want is credibility, not confidence, so frequentists have the wrong goal, but I wouldn't be so sure. It's nice to be able to say "the probability that theta lies in (-1, 1) is 95%", but it's embarrassing when someone points out that the coverage of this credible interval might be zero. Scientists may sometimes interpret confidence intervals as credible intervals, but they also expect their intervals to trap the truth most of the time. Frequentism offers that guarantee; Bayes does not.
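To illustrate the coverage point, here is a small simulation sketch for a normal mean, with a deliberately misspecified prior chosen for the illustration (all the numbers are assumptions): every interval it produces has 95% posterior probability by construction, yet because the prior is concentrated far from the true theta, its frequentist coverage is essentially zero.

```python
# Sketch: a 95% credible interval whose frequentist coverage is far below 95%.
# Model: x_i ~ N(theta, 1); prior theta ~ N(0, tau^2) with tau small (a strong, wrong prior).
import numpy as np

rng = np.random.default_rng(1)
theta_true, n, tau2, reps = 5.0, 10, 0.25, 5000
covered = 0

for _ in range(reps):
    x = rng.normal(theta_true, 1.0, size=n)
    # Conjugate normal-normal update with known sampling variance 1:
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = post_var * n * x.mean()
    lo, hi = post_mean - 1.96 * post_var**0.5, post_mean + 1.96 * post_var**0.5
    covered += (lo <= theta_true <= hi)

# Each interval carries 95% posterior probability, but across replications
# it almost never traps the true theta, because the prior drags it toward zero.
print("empirical coverage:", covered / reps)
```

The posterior probability statement is true within the assumed model, but it says nothing about how often the interval contains the truth under replication, which is the property the coverage claim is about.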
In the end, I think what we really want is frequentist properties, and in most cases Bayesian methods are one way (though not the only way) to secure those properties, with the elegance of the Bayesian algorithm as a bonus. But when Bayes does not perform well by frequentist standards, I think good science means working harder to find something that does, not moving the goalposts and saying "well, it's Bayesian, so it's automatically right".