Thursday 26 October 2017

How do all of these "Save video from YouTube" services work?


I mean, how do they work, generally? How do they receive the link to a video stream itself (not just the page containing a Flash player)?


I searched the web but couldn't find anything useful (every link points to one of these services, but none of them explain how they are actually implemented).



Answer



There is a very popular open source command-line downloader called youtube-dl, which does exactly that. It grabs the actual video and audio file links from a given YouTube link – or any other popular web video site like Vimeo, Yahoo! Video, uStream, etc.


To see how that's done, look into its YouTube extractor. It is too long to reproduce here. Extractors for simpler sites exist as well. Steven Penny also has a simple JavaScript downloader for YouTube, which is a little more straightforward.


But basically, a Flash video player must be initialized and configured through some JavaScript. Simply put, the Flash player object receives the URL of a video stream to load.


To find the video stream, you have to parse the HTML and JS code of the video page, locate the relevant initialization code, and from there try to find the link to the actual MP4 file. It might be there in plaintext, but it could also be generated on the fly with site-specific download tokens. Often, the JavaScript is obfuscated to make it harder to reverse-engineer. Or the video information might be contained in an XML file that is loaded asynchronously by JS.
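As a rough illustration of the "find the initialization code, then the link" step, here is a minimal Python sketch. Everything about the page is an assumption made up for the example: the variable name `playerConfig`, the JSON layout with a `streams` list, and the `example.com` URLs. Real sites differ, and obfuscated or token-based sites need considerably more work.

```python
import json
import re

# Hypothetical page: the player is configured by an inline script.
# The name `playerConfig` and the JSON layout are invented for this sketch.
SAMPLE_PAGE = """
<html><body>
<div id="player"></div>
<script>
var playerConfig = {"title": "Demo clip", "streams": [
  {"quality": "360p", "url": "https://cdn.example.com/v/demo_360.mp4?token=abc"},
  {"quality": "720p", "url": "https://cdn.example.com/v/demo_720.mp4?token=abc"}
]};
initPlayer("player", playerConfig);
</script>
</body></html>
"""


def extract_player_config(html):
    """Return the JSON object assigned to `playerConfig`, or None if absent."""
    match = re.search(r"var\s+playerConfig\s*=\s*", html)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (here: the trailing ";" and more JS).
    config, _end = json.JSONDecoder().raw_decode(html, match.end())
    return config


def best_stream_url(config):
    """Pick the stream with the highest numeric quality label, e.g. "720p"."""
    streams = config.get("streams", [])
    if not streams:
        return None
    best = max(streams, key=lambda s: int(s["quality"].rstrip("p")))
    return best["url"]


if __name__ == "__main__":
    print(best_stream_url(extract_player_config(SAMPLE_PAGE)))
```

This only works when the configuration is valid JSON embedded in the page. If it is a JavaScript object literal with unquoted keys, or the URL is assembled by script code at runtime, a plain JSON parse is not enough.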


For HTML5 progressive download video, the actual source file is usually referenced directly in the source child of the video tag, so you can find it by searching the page source for mp4 or similar. This is the case, for example, on the page of the German news show Tagesschau 100.
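The exact markup differs from site to site. As a generic sketch (the markup and URLs below are made up for illustration and are not taken from the Tagesschau page), Python's standard `html.parser` can collect the `src` attributes of a video tag and its source children:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Invented sample markup; real pages will look different.
SAMPLE_PAGE = """
<video controls poster="/img/poster.jpg">
  <source src="/media/clip.webm" type="video/webm">
  <source src="https://media.example.org/clip.mp4" type="video/mp4">
</video>
<audio><source src="/media/jingle.mp3"></audio>
"""


class VideoSourceFinder(HTMLParser):
    """Collects absolute URLs from <video src> and <video><source src>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.in_video = False
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self.in_video = True
            if attrs.get("src"):
                self.sources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "source" and self.in_video and attrs.get("src"):
            # Relative paths are resolved against the page URL.
            self.sources.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag == "video":
            self.in_video = False


def find_video_sources(html, base_url):
    finder = VideoSourceFinder(base_url)
    finder.feed(html)
    return finder.sources


if __name__ == "__main__":
    for url in find_video_sources(SAMPLE_PAGE, "https://www.example.org/show/"):
        print(url)
```

This only sees markup that is present in the delivered HTML. If the video tag is created by JavaScript after the page loads, the static source will not contain it.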




For more advanced playback technologies like MPEG-DASH or Apple's HTTP Live Streaming (HLS), you have to parse a meta-information file to get the actual video stream. The meta file (.mpd for DASH, .m3u8 for HLS) contains links to segments of video and audio, which you then have to download and combine to get a playable file.
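For HLS, the idea can be sketched as follows. This is a minimal reader for a simple media playlist, not a complete HLS implementation: it ignores encryption, byte ranges and master playlists (which list variant playlists rather than segments). The playlist contents and the `cdn.example.com` URLs are invented for the example.

```python
from urllib.parse import urljoin

# Invented sample media playlist; segment names and durations are made up.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.0,
segment0.ts
#EXTINF:10.0,
segment1.ts
#EXTINF:4.5,
https://cdn.example.com/other/segment2.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text, playlist_url):
    """Return a list of (absolute_segment_url, duration_in_seconds) tuples."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an M3U8 playlist")

    segments = []
    duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>" precedes each segment URI.
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # Any non-comment, non-empty line is a segment URI, possibly
            # relative to the playlist's own URL.
            segments.append((urljoin(playlist_url, line), duration))
            duration = None
    return segments


if __name__ == "__main__":
    playlist_url = "https://cdn.example.com/live/index.m3u8"
    for url, seconds in parse_media_playlist(SAMPLE_PLAYLIST, playlist_url):
        print(seconds, url)
```

The segments would then be downloaded in order and joined. For MPEG-TS segments, simple concatenation often yields a playable file, but in general a remuxing tool is the safer choice. A DASH .mpd file is XML and needs a different parser, though the overall approach is the same.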


There's no general solution for this. It requires careful inspection and debugging of the target site.

