This study does a thorough and satisfactory job of correlating vocal differences between individuals from different populations with geographic distance. There are a few technical issues, although I think these could be addressed and the manuscript resubmitted without too much more work. My first concern is that time needs to be addressed. The recordings were made in different years, so you need to make sure that recording year is not contributing to the vocal differences you measure between populations. This could be done by comparing recordings from one population across years (if you have multiple recordings over time for one population) or by including time as a covariate in your correlation. Since you have this information, it shouldn't be too hard to do, and it hopefully wouldn't affect the results too much.
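To illustrate the covariate option, here is a minimal sketch of one way it could be done: a partial Mantel test, which correlates vocal distance with geographic distance after regressing both on the difference in recording year. This is my assumption about a suitable method, not necessarily the analysis used in the manuscript. The function names and the matrix names (`vocal`, `geo`, `time`) are hypothetical, and each matrix is assumed to be a symmetric pairwise distance matrix over the same individuals in the same order.

```python
import numpy as np


def upper(m):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    i, j = np.triu_indices(m.shape[0], k=1)
    return m[i, j]


def residuals(y, x):
    """Residuals from a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


def partial_mantel(vocal, geo, time, n_perm=999, seed=0):
    """Correlate vocal and geographic distance while controlling for time.

    All three arguments are assumed (hypothetically) to be symmetric
    pairwise distance matrices with matching row and column order.
    Returns the partial correlation and a one-sided permutation p-value.
    """
    rng = np.random.default_rng(seed)
    t = upper(time)
    geo_res = residuals(upper(geo), t)

    def stat(v):
        # Pearson correlation between the two sets of residuals.
        return np.corrcoef(residuals(upper(v), t), geo_res)[0, 1]

    observed = stat(vocal)
    n = vocal.shape[0]
    count = 0
    for _ in range(n_perm):
        # Permute rows and columns of the vocal matrix together, so the
        # matrix stays a valid distance matrix under the null hypothesis.
        p = rng.permutation(n)
        if stat(vocal[np.ix_(p, p)]) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value
```

If the partial correlation stays close to the uncorrected one, recording year is unlikely to be driving the result. An established implementation (for example `mantel.partial` in the R package vegan) would be preferable to this sketch for the actual analysis.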
My second issue is the writing. It would be good to have a thorough proofread by multiple authors to make sure the English is correct. I have gone through the manuscript and suggested changes to improve on this point. While this doesn't necessarily change the conclusions of the study, it does make a difference to how people will read it, and therefore to its perceived validity and impact. It also helps ensure that nothing is misinterpreted or misread.
Finally, I suggest that you be very careful when drawing conclusions about the connection between your study and genetic structure. Because you don't include genetic data, you can only speculate and hypothesize about how vocalizations in this species might reflect genetic structure. Please reword these passages so that it does not sound like you tested this, but rather that you are suggesting how your study provides evidence for a theory and perhaps a basis for further research.
I think that with these changes the manuscript would be perfectly suitable for publication and would be a useful reference for those working in this field.