This is due to how video files work.
In general, a video file is a container for multiple separate streams. A video stream holds the actual video data, which is encoded with a video codec such as Xvid for MPEG-4. A separate audio stream holds the audio data, which is encoded with an audio codec such as AAC.
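This AAC-encoded data, which you can find in an MP4 video file, is exactly the same format of data you can find in an AAC audio file such as an .m4a file. Likewise, a video file whose audio stream is encoded as MP3 carries the same kind of data as an MP3 audio file. In other words, it's as if a video file had an audio file inside of it.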
In order for a video player to work, it must be able to decode the audio data in the audio stream inside video files. If it's capable of doing that, most of the work needed in order for the application to open audio files is already done, so it's trivial for the video player to also become able to open audio files.
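To make that concrete, here is a minimal toy sketch in Python. It is not how any real player is implemented: the `Stream` and `MediaFile` classes, the `DECODERS` table, and the file names are all hypothetical, made up for illustration. It only shows that once a player has an audio decoder for the streams inside video files, an audio-only file is just the same case with fewer streams.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    kind: str   # "video" or "audio"
    codec: str  # e.g. "mpeg4", "aac"


@dataclass
class MediaFile:
    name: str
    streams: list  # list of Stream objects held by the container


# Hypothetical table of codecs this toy player knows how to decode.
# The audio entries are needed for video playback anyway.
DECODERS = {
    "mpeg4": "video decoder",
    "aac": "audio decoder",
    "mp3": "audio decoder",
}


def playable_streams(media):
    """Return the (kind, codec) pairs this toy player can decode."""
    return [(s.kind, s.codec) for s in media.streams if s.codec in DECODERS]


# A video file: one video stream plus one audio stream.
movie = MediaFile("movie.mp4", [Stream("video", "mpeg4"), Stream("audio", "aac")])
# An audio file: the same kind of audio stream, with no video stream.
song = MediaFile("song.m4a", [Stream("audio", "aac")])

print(playable_streams(movie))
print(playable_streams(song))
```

The same function handles both files without any audio-specific code path, which is the sense in which opening audio files comes almost for free.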
This is why video players are often "media players" or "multimedia players" capable of opening both audio and video.
It's worth noting that playing music is typically just something that the video player CAN do; it's not something it's specialized for. In particular, a video player will have a large main pane for displaying video that serves no purpose whatsoever when playing music (except for displaying visualizations in players that support that). By contrast, a music player will prominently feature the playlist instead.
Video players may also struggle with music formats that aren't used inside video files, such as MIDI.