"Most" don't need to. Merely some quitting for gender-related mistreatment is sufficient to produce their results, leaving sexism the causal mechanism that is responsible for both the pay and promotion differences, and for the attrition, experience, and age differences that they are "controlling" for.
The article also said that internal promotions for men and women are generally equal; however, men being offered higher positions with other companies is much more common than women being given the same opportunities. That said, internal promotions statistically produce lower earnings than external promotions.
You're missing the fact that the women were making more until they disappeared from the data.
No, women weren't paid more before they left. The results show higher pay for women only "After controlling for the background variables". That doesn't mean any women were paid more than men of equal rank. It means that if you eliminate those aspects of being a woman that relate to why women quit or move to another company more often and why the young employees are disproportionately women, then the remaining aspects of being a woman are predicted to (not observed to) have a positive impact on salary.
The reason that such analyses are called "Inferential Statistics" is that they do not describe what is actually true about the observed data or observed relationships. They provide estimates of what is predicted to be true in a hypothetical universe in which various pathways between variables do not exist, and thus the measured variables are not what they actually are but only those sub-dimensions of the variables that are not related to those eliminated pathways.
Multiple regression coefficients can reflect nonsensical values that don't exist and never could. For example, regression coefficients also tell you what would be true of Y at a given level of X when a third variable Z is at a value that isn't even possible in the real world. They can tell you the salary of a person who is negative 1000 years old.
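To make the negative-1000-years point concrete, here is a minimal sketch. The ages and salaries are made up purely for illustration (they are not from the study), and `fit_line` and `predict` are hypothetical helper names. It fits an ordinary least squares line and then asks it about an age that cannot exist.

```python
# Toy data, invented for illustration: ages in years, salaries in $1000s.
ages = [25, 30, 35, 40, 45]
salaries = [40, 45, 50, 55, 60]

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = fit_line(ages, salaries)

def predict(age):
    """Plug any number into the fitted equation, sensible or not."""
    return intercept + slope * age

print(predict(35))     # inside the observed age range
print(predict(-1000))  # the equation answers anyway
```

With these toy numbers the fitted line is salary = 15 + 1 * age, so the "person" who is negative 1000 years old gets a predicted salary of -985 thousand. Nothing in the math objects; only the reader can.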
In the present case, suppose that due to sexism young hot women start out with slightly higher pay than men. Maybe the company wants female underlings but not as many women in higher ranks, so it pays them more than men at the start, but they do not get raises equal to the men's as they get older and gain job experience. So, the men pass them by in pay and the older women get pissed and leave. The result would be just what their data show. If you run a regression controlling for age and experience, it will take the higher starting pay of the women and assume that age and experience impact raises equally for men and women, extrapolating what each group's salary would be if they had the same age and experience. But again, age and experience differences could easily be effects of attrition due to differential pay raises over time, so the assumption the analysis is making is invalid.
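Here is a rough simulation of that scenario. Every number is invented for illustration and none of it comes from the study: men start at 50 and get raises of 3 per year, women start at 56 and get raises of 1 per year, and the women are all gone after year 4. The `solve` and `ols` helpers are hypothetical names for a bare-bones least squares fit, written out so the sketch needs nothing beyond the standard library.

```python
# Hypothetical company. Each row is (female, years of experience, salary).
rows = [(0, yrs, 50 + 3 * yrs) for yrs in range(11)]   # men, years 0..10
rows += [(1, yrs, 56 + 1 * yrs) for yrs in range(5)]   # women, years 0..4

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(xs, ys):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

def mean(vals):
    return sum(vals) / len(vals)

# Observed relationship: average pay of women minus average pay of men.
raw_gap = mean([s for f, _, s in rows if f == 1]) - mean([s for f, _, s in rows if f == 0])

# "Controlled" relationship: salary ~ intercept + female + experience.
design = [[1.0, f, yrs] for f, yrs, _ in rows]
pay = [s for _, _, s in rows]
b0, b_female, b_experience = ols(design, pay)

print("raw gap (women minus men):", raw_gap)
print("female coefficient, controlling for experience:", round(b_female, 3))
print("single raise rate the model assumes for everyone:", round(b_experience, 3))
```

In this toy company the women on the payroll earn 7 less than the men on average, and their raises are a third of the men's, yet the controlled coefficient for female comes out at about +1.5. The model gets there by fitting one shared raise rate (about 2.83 per year) to both groups and treating the experience gap as a given rather than as a product of who left.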
The results do not say what women who don't leave are getting paid compared to men. They predict what the people categorized as "female" would get paid in a world where females do not differ from males in any way that is related to quitting or transferring, and where the resulting differences in age and experience are therefore eliminated.
Only if that hypothetical non-existent universe is the same as the real one in all important theoretical ways is that a meaningful result.
Another way to think about this is that adding a control variable changes what the main variables actually represent, especially when you are not measuring a singular, isolated physical property of the thing like in the natural sciences. Control variables transform the main predictors into a new variable that reflects only the portion of the variance in the main variable that has no relationship to the control variables. If the critical aspects of that main variable are related to the control variables, then you have changed the main variable into something qualitatively lacking the core features of what you are trying to measure. In our example, there may be central features of gender that trigger sexism which then impacts attrition and thus avg age and experience of women on the job. By controlling for age and experience, the gender variable changes into those things correlated with gender that have no impact on or relation to anything that might impact attrition, such as sexism.
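This is not just a figure of speech. By the Frisch-Waugh-Lovell theorem, the coefficient a multiple regression reports for a predictor is exactly the slope of the outcome on whatever is left of that predictor after the controls have been regressed out of it. A minimal sketch, using an invented toy company (men start at 50 with raises of 3 per year and stay through year 10; women start at 56 with raises of 1 per year and are gone after year 4; none of these numbers come from the study, and the helper names are hypothetical):

```python
# Invented toy data. Each row is (female, years of experience, salary).
rows = [(0, yrs, 50 + 3 * yrs) for yrs in range(11)]
rows += [(1, yrs, 56 + 1 * yrs) for yrs in range(5)]
female = [f for f, _, _ in rows]
experience = [yrs for _, yrs, _ in rows]
salary = [s for _, _, s in rows]

def mean(vals):
    return sum(vals) / len(vals)

def slope(xs, ys):
    """Simple regression slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Step 1: strip out of "female" everything experience can linearly predict.
g = slope(experience, female)
mf, me = mean(female), mean(experience)
female_resid = [f - mf - g * (e - me) for f, e in zip(female, experience)]

# Step 2: the "controlled" gender effect is the slope of salary on the leftover.
adjusted_effect = slope(female_resid, salary)

# How much of the original gender variable survived the control?
share_left = sum(r ** 2 for r in female_resid) / sum((f - mf) ** 2 for f in female)

print("distinct values of the leftover variable:", len(set(round(r, 6) for r in female_resid)))
print("coefficient on the leftover variable:", round(adjusted_effect, 3))
print("share of gender variance remaining:", round(share_left, 3))
```

The leftover variable is no longer a 0/1 gender indicator. It takes a different value at every combination of gender and experience, and in this toy case it keeps only about 80% of the original variance. The "gender effect" of about +1.5 belongs to that leftover variable, not to gender as it exists on the payroll.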
An analogy would be if your main predictor was self-reported gender (like in this study), but then you controlled for whether a person has a Y chromosome. You would likely alter how the gender variable is related to almost everything, because you have altered what its remaining variance represents, which in this case would be almost nothing about gender in general, and only variance tied to whether a person has a gender identity that differs from their sex as indicated by their chromosomes.
Obviously, that is the extreme case to illustrate the problem, but a milder version of that problem occurs every time a control variable is used that could be causally related to much of what you're trying to capture with your main variable.
In sum, the general problem, and a common one in the Social Sciences, is a naive notion that entering control variables shows you the "true" relationship between variables. This is bullshit. The true relationship is the simple two-variable correlation. That is the observed relationship. Everything else is presumption-filled inference. There are two general uses of control variables that are valid. The first is to show that the observed relationship (simple correlation) is estimated to be largely the same even in hypothetical universes where all pathways related to the control variables don't exist. That supports the inference that the observed relationship is via some other pathways. The second valid use is to show that the observed relationship does change in that hypothetical universe without those pathways. That supports the inference that this eliminated pathway is the meaningful one in the real world, driving the actual relationship between the variables. Notice that in both cases the conclusions are focused on why the actual observed relationship exists, NOT on assuming that the estimated relationship in the hypothetical universe is meaningful, because it isn't. It's fiction. Its variables don't have some of the often key dimensions that they do in the real world, and it doesn't allow pathways that actually exist and matter for outcomes.
A red flag for statistical bullshit is when a multivariate analysis does not start by showing you the actual two-way correlations between the variables and then treat the regression coefficients not as meaningful in themselves, but as evidence for interpreting the nature of the observed relationship, which is the simple aggregate correlation.