X-Git-Url: https://git.rapsys.eu/youtubedl/blobdiff_plain/b8d8e13c1f9e4d3cdd7d41c5c9d711a36dd5f9c3..ba6dee71ec16562c1960060bb7cd0aa7aad5252d:/README.txt?ds=inline diff --git a/README.txt b/README.txt index 24d4314..974ea6e 100644 --- a/README.txt +++ b/README.txt @@ -1,3 +1,5 @@ +[Build Status] + youtube-dl - download videos from youtube.com or other video platforms - INSTALLATION @@ -18,7 +20,7 @@ youtube-dl - download videos from youtube.com or other video platforms INSTALLATION -To install it right away for all UNIX users (Linux, OS X, etc.), type: +To install it right away for all UNIX users (Linux, macOS, etc.), type: sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl sudo chmod a+rx /usr/local/bin/youtube-dl @@ -39,7 +41,7 @@ You can also use pip: This command will update youtube-dl if you have already installed it. See the pypi page for more information. -OS X users can install youtube-dl with Homebrew: +macOS users can install youtube-dl with Homebrew: brew install youtube-dl @@ -59,9 +61,9 @@ DESCRIPTION YOUTUBE-DL is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your -Unix box, on Windows or on Mac OS X. It is released to the public -domain, which means you can modify it, redistribute it or use it however -you like. +Unix box, on Windows or on macOS. It is released to the public domain, +which means you can modify it, redistribute it or use it however you +like. youtube-dl [OPTIONS] URL [URL...] @@ -114,19 +116,33 @@ OPTIONS Network Options: --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. - To enable experimental SOCKS proxy, specify - a proper scheme. For example + To enable SOCKS proxy, specify a proper + scheme. For example socks5://127.0.0.1:1080/. 
Pass in an empty string (--proxy "") for direct connection --socket-timeout SECONDS Time to wait before giving up, in seconds --source-address IP Client-side IP address to bind to -4, --force-ipv4 Make all connections via IPv4 -6, --force-ipv6 Make all connections via IPv6 + + +Geo Restriction: + --geo-verification-proxy URL Use this proxy to verify the IP address for some geo-restricted sites. The default proxy specified by --proxy (or none, if the - options is not present) is used for the + option is not present) is used for the actual downloading. + --geo-bypass Bypass geographic restriction via faking + X-Forwarded-For HTTP header + --no-geo-bypass Do not bypass geographic restriction via + faking X-Forwarded-For HTTP header + --geo-bypass-country CODE Force bypass geographic restriction with + explicitly provided two-letter ISO 3166-2 + country code + --geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with + explicitly provided IP block in CIDR + notation Video Selection: @@ -160,22 +176,24 @@ Video Selection: --max-views COUNT Do not download any videos with more than COUNT views --match-filter FILTER Generic video filter. Specify any key (see - help for -o for a list of available keys) - to match if the key is present, !key to - check if the key is not present,key > - NUMBER (like "comment_count > 12", also - works with >=, <, <=, !=, =) to compare - against a number, and & to require multiple - matches. Values which are not known are - excluded unless you put a question mark (?) 
- after the operator.For example, to only - match videos that have been liked more than - 100 times and disliked less than 50 times - (or the dislike functionality is not - available at the given service), but who - also have a description, use --match-filter - "like_count > 100 & dislike_count NUMBER (like "comment_count + > 12", also works with >=, <, <=, !=, =) to + compare against a number, key = 'LITERAL' + (like "uploader = 'Mike Smith'", also works + with !=) to match against a string literal + and & to require multiple matches. Values + which are not known are excluded unless you + put a question mark (?) after the operator. + For example, to only match videos that have + been liked more than 100 times and disliked + less than 50 times (or the dislike + functionality is not available at the given + service), but who also have a description, + use --match-filter "like_count > 100 & + dislike_count .+?) - (?P.+)" --xattrs Write metadata to the video file's xattrs (using dublin core and xdg standards) --fixup POLICY Automatically correct known faults of the @@ -453,18 +474,18 @@ Post-processing Options: default; fix file if we can, warn otherwise) --prefer-avconv Prefer avconv over ffmpeg for running the - postprocessors (default) - --prefer-ffmpeg Prefer ffmpeg over avconv for running the postprocessors + --prefer-ffmpeg Prefer ffmpeg over avconv for running the + postprocessors (default) --ffmpeg-location PATH Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory. --exec CMD Execute a command on the file after - downloading, similar to find's -exec - syntax. Example: --exec 'adb push {} - /sdcard/Music/ && rm {}' + downloading and post-processing, similar to + find's -exec syntax. 
Example: --exec 'adb + push {} /sdcard/Music/ && rm {}' --convert-subs FORMAT Convert the subtitles to other format - (currently supported: srt|ass|vtt) + (currently supported: srt|ass|vtt|lrc) @@ -472,7 +493,7 @@ CONFIGURATION You can configure youtube-dl by placing any supported command line -option to a configuration file. On Linux and OS X, the system wide +option to a configuration file. On Linux and macOS, the system wide configuration file is located at /etc/youtube-dl.conf and the user wide configuration file at ~/.config/youtube-dl/config. On Windows, the user wide configuration file locations are %APPDATA%\youtube-dl\config.txt or @@ -535,7 +556,9 @@ To activate authentication with the .netrc file you should pass --netrc to youtube-dl or place it in the configuration file. On Windows you may also need to setup the %HOME% environment variable -manually. +manually. For example: + + set HOME=%USERPROFILE% @@ -548,84 +571,114 @@ names. TL;DR: navigate me to examples. The basic usage is not to set any template arguments when downloading a -single file, like in youtube-dl -o funny_video.flv "http://some/video". +single file, like in youtube-dl -o funny_video.flv "https://some/video". However, it may contain special sequences that will be replaced when -downloading each video. The special sequences have the format %(NAME)s. -To clarify, that is a percent symbol followed by a name in parentheses, -followed by a lowercase S. 
Allowed names are: - -- id: Video identifier -- title: Video title -- url: Video URL -- ext: Video filename extension -- alt_title: A secondary title of the video -- display_id: An alternative identifier for the video -- uploader: Full name of the video uploader -- license: License name the video is licensed under -- creator: The creator of the video -- release_date: The date (YYYYMMDD) when the video was released -- timestamp: UNIX timestamp of the moment the video became available -- upload_date: Video upload date (YYYYMMDD) -- uploader_id: Nickname or id of the video uploader -- location: Physical location where the video was filmed -- duration: Length of the video in seconds -- view_count: How many users have watched the video on the platform -- like_count: Number of positive ratings of the video -- dislike_count: Number of negative ratings of the video -- repost_count: Number of reposts of the video -- average_rating: Average rating give by users, the scale used depends - on the webpage -- comment_count: Number of comments on the video -- age_limit: Age restriction for the video (years) -- format: A human-readable description of the format -- format_id: Format code specified by --format -- format_note: Additional info about the format -- width: Width of the video -- height: Height of the video -- resolution: Textual description of width and height -- tbr: Average bitrate of audio and video in KBit/s -- abr: Average audio bitrate in KBit/s -- acodec: Name of the audio codec in use -- asr: Audio sampling rate in Hertz -- vbr: Average video bitrate in KBit/s -- fps: Frame rate -- vcodec: Name of the video codec in use -- container: Name of the container format -- filesize: The number of bytes, if known in advance -- filesize_approx: An estimate for the number of bytes -- protocol: The protocol that will be used for the actual download -- extractor: Name of the extractor -- extractor_key: Key name of the extractor -- epoch: Unix epoch when creating the file -- 
autonumber: Five-digit number that will be increased with each - download, starting at zero -- playlist: Name or id of the playlist that contains the video -- playlist_index: Index of the video in the playlist padded with - leading zeros according to the total length of the playlist -- playlist_id: Playlist identifier -- playlist_title: Playlist title +downloading each video. The special sequences may be formatted according +to python string formatting operations. For example, %(NAME)s or +%(NAME)05d. To clarify, that is a percent symbol followed by a name in +parentheses, followed by formatting operations. Allowed names along with +sequence type are: + +- id (string): Video identifier +- title (string): Video title +- url (string): Video URL +- ext (string): Video filename extension +- alt_title (string): A secondary title of the video +- display_id (string): An alternative identifier for the video +- uploader (string): Full name of the video uploader +- license (string): License name the video is licensed under +- creator (string): The creator of the video +- release_date (string): The date (YYYYMMDD) when the video was + released +- timestamp (numeric): UNIX timestamp of the moment the video became + available +- upload_date (string): Video upload date (YYYYMMDD) +- uploader_id (string): Nickname or id of the video uploader +- channel (string): Full name of the channel the video is uploaded on +- channel_id (string): Id of the channel +- location (string): Physical location where the video was filmed +- duration (numeric): Length of the video in seconds +- view_count (numeric): How many users have watched the video on the + platform +- like_count (numeric): Number of positive ratings of the video +- dislike_count (numeric): Number of negative ratings of the video +- repost_count (numeric): Number of reposts of the video +- average_rating (numeric): Average rating give by users, the scale + used depends on the webpage +- comment_count (numeric): Number of 
comments on the video +- age_limit (numeric): Age restriction for the video (years) +- is_live (boolean): Whether this video is a live stream or a + fixed-length video +- start_time (numeric): Time in seconds where the reproduction should + start, as specified in the URL +- end_time (numeric): Time in seconds where the reproduction should + end, as specified in the URL +- format (string): A human-readable description of the format +- format_id (string): Format code specified by --format +- format_note (string): Additional info about the format +- width (numeric): Width of the video +- height (numeric): Height of the video +- resolution (string): Textual description of width and height +- tbr (numeric): Average bitrate of audio and video in KBit/s +- abr (numeric): Average audio bitrate in KBit/s +- acodec (string): Name of the audio codec in use +- asr (numeric): Audio sampling rate in Hertz +- vbr (numeric): Average video bitrate in KBit/s +- fps (numeric): Frame rate +- vcodec (string): Name of the video codec in use +- container (string): Name of the container format +- filesize (numeric): The number of bytes, if known in advance +- filesize_approx (numeric): An estimate for the number of bytes +- protocol (string): The protocol that will be used for the actual + download +- extractor (string): Name of the extractor +- extractor_key (string): Key name of the extractor +- epoch (numeric): Unix epoch when creating the file +- autonumber (numeric): Five-digit number that will be increased with + each download, starting at zero +- playlist (string): Name or id of the playlist that contains the + video +- playlist_index (numeric): Index of the video in the playlist padded + with leading zeros according to the total length of the playlist +- playlist_id (string): Playlist identifier +- playlist_title (string): Playlist title +- playlist_uploader (string): Full name of the playlist uploader +- playlist_uploader_id (string): Nickname or id of the playlist + uploader 
Available for the video that belongs to some logical chapter or section: -- chapter: Name or title of the chapter the video belongs to - -chapter_number: Number of the chapter the video belongs to - chapter_id: -Id of the chapter the video belongs to + +- chapter (string): Name or title of the chapter the video belongs to +- chapter_number (numeric): Number of the chapter the video belongs to +- chapter_id (string): Id of the chapter the video belongs to Available for the video that is an episode of some series or programme: -- series: Title of the series or programme the video episode belongs to -- season: Title of the season the video episode belongs to - -season_number: Number of the season the video episode belongs to - -season_id: Id of the season the video episode belongs to - episode: -Title of the video episode - episode_number: Number of the video episode -within a season - episode_id: Id of the video episode - -Available for the media that is a track or a part of a music album: - -track: Title of the track - track_number: Number of the track within an -album or a disc - track_id: Id of the track - artist: Artist(s) of the -track - genre: Genre(s) of the track - album: Title of the album the -track belongs to - album_type: Type of the album - album_artist: List of -all artists appeared on the album - disc_number: Number of the disc or -other physical medium the track belongs to - release_year: Year (YYYY) -when the album was released + +- series (string): Title of the series or programme the video episode + belongs to +- season (string): Title of the season the video episode belongs to +- season_number (numeric): Number of the season the video episode + belongs to +- season_id (string): Id of the season the video episode belongs to +- episode (string): Title of the video episode +- episode_number (numeric): Number of the video episode within a + season +- episode_id (string): Id of the video episode + +Available for the media that is a track or a part of a 
music album: + +- track (string): Title of the track +- track_number (numeric): Number of the track within an album or a + disc +- track_id (string): Id of the track +- artist (string): Artist(s) of the track +- genre (string): Genre(s) of the track +- album (string): Title of the album the track belongs to +- album_type (string): Type of the album +- album_artist (string): List of all artists appeared on the album +- disc_number (numeric): Number of the disc or other physical medium + the track belongs to +- release_year (numeric): Year (YYYY) when the album was released Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note @@ -638,6 +691,10 @@ youtube-dl test video and id BaW_jenozKcj, this will result in a youtube-dl test video-BaW_jenozKcj.mp4 file created in the current directory. +For numeric sequences you can use numeric related formatting, for +example, %(view_count)05d will result in a string with view count padded +with zeros up to 5 characters, like in 00042. + Output templates can also contain arbitrary hierarchical path, e.g. -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' which will result in downloading each video in a directory corresponding to this @@ -665,7 +722,8 @@ should stay intact: -o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s". Output template examples -Note on Windows you may need to use double quotes instead of single. +Note that on Windows you may need to use double quotes instead of +single. $ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc youtube-dl test video ''_ä↭𝕐.mp4 # All kinds of weird characters @@ -683,7 +741,7 @@ Note on Windows you may need to use double quotes instead of single. 
$ youtube-dl -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/ # Download entire series season keeping each series and each season in separate directory under C:/MyVideos - $ youtube-dl -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" http://videomore.ru/kino_v_detalayah/5_sezon/367617 + $ youtube-dl -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617 # Stream the video being downloaded to stdout $ youtube-dl -o - BaW_jenozKc @@ -721,15 +779,20 @@ of a particular file extension served as a single file, e.g. -f webm will download the best quality format with the webm extension served as a single file. -You can also use special names to select particular edge case formats: - -best: Select the best quality format represented by a single file with -video and audio. - worst: Select the worst quality format represented by -a single file with video and audio. - bestvideo: Select the best quality -video-only format (e.g. DASH video). May not be available. - worstvideo: -Select the worst quality video-only format. May not be available. - -bestaudio: Select the best quality audio only-format. May not be -available. - worstaudio: Select the worst quality audio only-format. May -not be available. +You can also use special names to select particular edge case formats: + +- best: Select the best quality format represented by a single file + with video and audio. +- worst: Select the worst quality format represented by a single file + with video and audio. +- bestvideo: Select the best quality video-only format (e.g. DASH + video). May not be available. +- worstvideo: Select the worst quality video-only format. May not be + available. +- bestaudio: Select the best quality audio only-format. May not be + available. 
+- worstaudio: Select the worst quality audio only-format. May not be + available. For example, to download the worst quality video-only format you can use -f worstvideo. @@ -752,20 +815,31 @@ You can also filter the video formats by putting a condition in brackets, as in -f "best[height=720]" (or -f "[filesize>10M]"). The following numeric meta fields can be used with comparisons <, <=, >, ->=, = (equals), != (not equals): - filesize: The number of bytes, if -known in advance - width: Width of the video, if known - height: Height -of the video, if known - tbr: Average bitrate of audio and video in -KBit/s - abr: Average audio bitrate in KBit/s - vbr: Average video -bitrate in KBit/s - asr: Audio sampling rate in Hertz - fps: Frame rate - -Also filtering work for comparisons = (equals), != (not equals), ^= -(begins with), $= (ends with), *= (contains) and following string meta -fields: - ext: File extension - acodec: Name of the audio codec in use - -vcodec: Name of the video codec in use - container: Name of the -container format - protocol: The protocol that will be used for the -actual download, lower-case (http, https, rtsp, rtmp, rtmpe, mms, f4m, -ism, m3u8, or m3u8_native) - format_id: A short description of the -format +>=, = (equals), != (not equals): + +- filesize: The number of bytes, if known in advance +- width: Width of the video, if known +- height: Height of the video, if known +- tbr: Average bitrate of audio and video in KBit/s +- abr: Average audio bitrate in KBit/s +- vbr: Average video bitrate in KBit/s +- asr: Audio sampling rate in Hertz +- fps: Frame rate + +Also filtering work for comparisons = (equals), ^= (starts with), $= +(ends with), *= (contains) and following string meta fields: + +- ext: File extension +- acodec: Name of the audio codec in use +- vcodec: Name of the video codec in use +- container: Name of the container format +- protocol: The protocol that will be used for the actual download, + lower-case (http, https, rtsp, rtmp, 
rtmpe, mms, f4m, ism, + http_dash_segments, m3u8, or m3u8_native) +- format_id: A short description of the format + +Any string comparison may be prefixed with negation ! in order to +produce an opposite comparison, e.g. !*= (does not contain). Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular @@ -812,12 +886,13 @@ file in order not to type it every time you run youtube-dl. Format selection examples -Note on Windows you may need to use double quotes instead of single. +Note that on Windows you may need to use double quotes instead of +single. # Download best mp4 format available or any other best if no mp4 available $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best' - # Download best format available but not better that 480p + # Download best format available but no better than 480p $ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]' # Download best video only format but no bigger than 50 MB @@ -872,7 +947,7 @@ If you have installed youtube-dl using a package manager like _apt-get_ or _yum_, use the standard system update mechanism to update. Note that distribution packages are often outdated. As a rule of thumb, youtube-dl releases at least once a month, and often weekly or even daily. Simply -go to http://yt-dl.org/ to find out the current version. Unfortunately, +go to https://yt-dl.org to find out the current version. Unfortunately, there is nothing we youtube-dl developers can do if your distribution serves a really outdated version. You can (and should) complain to your distribution in their bugtracker or support forum. 
@@ -885,8 +960,8 @@ that, remove the distribution's package, with a line like Afterwards, simply follow our manual installation instructions: - sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl - sudo chmod a+x /usr/local/bin/youtube-dl + sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl + sudo chmod a+rx /usr/local/bin/youtube-dl hash -r Again, from then on you'll be able to update with sudo youtube-dl -U. @@ -1026,10 +1101,19 @@ above for how to update youtube-dl. HTTP Error 429: Too Many Requests or 402: Payment Required These two error codes indicate that the service is blocking your IP -address because of overuse. Contact the service and ask them to unblock -your IP address, or - if you have acquired a whitelisted IP address -already - use the --proxy or --source-address options to select another -IP address. +address because of overuse. Usually this is a soft block meaning that +you can gain access again after solving CAPTCHA. Just open a browser and +solve a CAPTCHA the service suggests you and after that pass cookies to +youtube-dl. Note that if your machine has multiple external IPs then you +should also pass exactly the same IP you've used for solving CAPTCHA +with --source-address. Also you may need to pass a User-Agent HTTP +header of your browser with --user-agent. + +If this is not the case (no CAPTCHA suggested to solve by the service) +then you can contact the service and ask them to unblock your IP +address, or - if you have acquired a whitelisted IP address already - +use the --proxy or --source-address options to select another IP +address. SyntaxError: Non-ASCII character @@ -1077,11 +1161,11 @@ all of your downloads, put the option into your configuration file. How do I download a video starting with a -? 
-Either prepend http://www.youtube.com/watch?v= or separate the ID from +Either prepend https://www.youtube.com/watch?v= or separate the ID from the options with --: youtube-dl -- -wNyEUrxzFU - youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU" + youtube-dl "https://www.youtube.com/watch?v=-wNyEUrxzFU" How do I pass cookies to youtube-dl? @@ -1090,15 +1174,15 @@ Use the --cookies option, for example In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, cookies.txt (for Chrome) -or Export Cookies (for Firefox). +or cookies.txt (for Firefox). Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either # HTTP Cookie File or # Netscape HTTP Cookie File. Make sure you have correct newline format in the cookies file and convert newlines if necessary to correspond with your OS, namely CRLF (\r\n) for Windows and LF (\n) for Unix and -Unix-like systems (Linux, Mac OS, etc.). HTTP Error 400: Bad Request -when using --cookies is a good sign of invalid newline format. +Unix-like systems (Linux, macOS, etc.). HTTP Error 400: Bad Request when +using --cookies is a good sign of invalid newline format. Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case @@ -1112,7 +1196,7 @@ You will first need to tell youtube-dl to stream media to stdout with capable of this for streaming) and then pipe former to latter. For example, streaming to vlc can be achieved with: - youtube-dl -o - "http://www.youtube.com/watch?v=BaW_jenozKcj" | vlc - + youtube-dl -o - "https://www.youtube.com/watch?v=BaW_jenozKcj" | vlc - How do I download only new videos from a playlist? @@ -1209,7 +1293,7 @@ How can I detect whether a given URL is supported by youtube-dl? For one, have a look at the list of supported sites. 
Note that it can sometimes happen that the site changes its URL scheme (say, from -http://example.com/video/1234567 to http://example.com/v/1234567 ) and +https://example.com/video/1234567 to https://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug. @@ -1273,6 +1357,9 @@ test file directly; any of the following work: python test/test_download.py nosetests +See item 6 of new extractor tutorial for how to run extractor specific +test cases. + If you want to create a build of youtube-dl yourself, you'll need - python @@ -1295,12 +1382,12 @@ yourextractor): 1. Fork this repository 2. Check out the source code with: - git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git + git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git 3. Start a new git branch with - cd youtube-dl - git checkout -b yourextractor + cd youtube-dl + git checkout -b yourextractor 4. Start with this simple template and save it to youtube_dl/extractor/yourextractor.py: @@ -1314,7 +1401,7 @@ yourextractor): class YourExtractorIE(InfoExtractor): _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)' _TEST = { - 'url': 'http://yourextractor.com/watch/42', + 'url': 'https://yourextractor.com/watch/42', 'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)', 'info_dict': { 'id': '42', @@ -1351,15 +1438,19 @@ yourextractor): _TEST to _TESTS and make it into a list of dictionaries. The tests will then be named TestDownload.test_YourExtractor, TestDownload.test_YourExtractor_1, - TestDownload.test_YourExtractor_2, etc. + TestDownload.test_YourExtractor_2, etc. Note that tests with + only_matching key in test's dict are not counted in. 7. Have a look at youtube_dl/extractor/common.py for possible helper methods and a detailed description of what your extractor should and may return. Add tests and code for as many as you want. 8. 
Make sure your code follows youtube-dl coding conventions and check - the code with flake8. Also make sure your code works under all - Python versions claimed supported by youtube-dl, namely 2.6, 2.7, - and 3.2+. -9. When the tests pass, add the new files and commit them and push the + the code with flake8: + + $ flake8 youtube_dl/extractor/yourextractor.py + +9. Make sure your code works under all Python versions claimed + supported by youtube-dl, namely 2.6, 2.7, and 3.2+. +10. When the tests pass, add the new files and commit them and push the result, like this: $ git add youtube_dl/extractor/extractors.py @@ -1367,7 +1458,7 @@ yourextractor): $ git commit -m '[yourextractor] Add new extractor' $ git push origin yourextractor -10. Finally, create a pull request. We'll then review and merge it. +11. Finally, create a pull request. We'll then review and merge it. In any case, thank you very much for your contributions! @@ -1496,9 +1587,31 @@ fallback scenario: This code will try to extract from meta first and if it fails it will try extracting og:title from a webpage. -Make regular expressions flexible +Regular expressions + +Don't capture groups you don't use + +Capturing group must be an indication that it's used somewhere in the +code. Any group that is not used must be non capturing. + +Example + +Don't capture id attribute name here since you can't use it for anything +anyway. + +Correct: -When using regular expressions try to write them fuzzy and flexible. + r'(?:id|ID)=(?P<id>\d+)' + +Incorrect: + + r'(id|ID)=(?P<id>\d+)' + +Make regular expressions relaxed and flexible + +When using regular expressions try to write them fuzzy, relaxed and +flexible, skipping insignificant parts that are more likely to change, +allowing both single and double quotes for quoted values and so on. 
Example @@ -1526,11 +1639,113 @@ The code definitely should not look like: r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>', webpage, 'title', group='title') -Use safe conversion functions +Long lines policy + +There is a soft limit to keep lines of code under 80 characters long. +This means it should be respected if possible and if it does not make +readability and code maintenance worse. + +For example, you should NEVER split long string literals like URLs or +some other often copied entities over multiple lines to fit this limit: + +Correct: + + 'https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' + +Incorrect: + + 'https://www.youtube.com/watch?v=FqZTN594JQw&list=' + 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' + +Inline values + +Extracting variables is acceptable for reducing code duplication and +improving readability of complex expressions. However, you should avoid +extracting variables used only once and moving them to opposite parts of +the extractor file, which makes reading the linear flow difficult. + +Example + +Correct: + + title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title') + +Incorrect: + + TITLE_RE = r'<title>([^<]+)</title>' + # ...some lines of code... + title = self._html_search_regex(TITLE_RE, webpage, 'title') + +Collapse fallbacks + +Multiple fallback values can quickly become unwieldy. Collapse multiple +fallback values into a single expression via a list of patterns. 
+ +Example + +Good: + + description = self._html_search_meta( + ['og:description', 'description', 'twitter:description'], + webpage, 'description', default=None) + +Unwieldy: + + description = ( + self._og_search_description(webpage, default=None) + or self._html_search_meta('description', webpage, default=None) + or self._html_search_meta('twitter:description', webpage, default=None)) + +Methods supporting list of patterns are: _search_regex, +_html_search_regex, _og_search_property, _html_search_meta. + +Trailing parentheses + +Always move trailing parentheses after the last argument. + +Example + +Correct: + + lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'], + list) + +Incorrect: + + lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'], + list, + ) + +Use convenience conversion and parsing functions + +Wrap all extracted numeric data into safe functions from +youtube_dl/utils.py: int_or_none, float_or_none. Use them for string to +number conversions as well. + +Use url_or_none for safe URL processing. + +Use try_get for safe metadata extraction from parsed JSON. + +Use unified_strdate for uniform upload_date or any YYYYMMDD meta field +extraction, unified_timestamp for uniform timestamp extraction, +parse_filesize for filesize extraction, parse_count for count meta +fields extraction, parse_resolution, parse_duration for duration +extraction, parse_age_limit for age_limit extraction. + +Explore youtube_dl/utils.py for more useful convenience functions. + +More examples + +Safely extract optional description from parsed JSON + + description = try_get(response, lambda x: x['result']['video'][0]['summary'], compat_str) + +Safely extract more optional metadata -Wrap all extracted numeric data into safe functions from utils: -int_or_none, float_or_none. Use them for string to number conversions as -well. 
+ video = try_get(response, lambda x: x['result']['video'][0], dict) or {} + description = video.get('summary') + duration = float_or_none(video.get('durationMs'), scale=1000) + view_count = int_or_none(video.get('views')) @@ -1549,7 +1764,7 @@ fashion, like this: ydl_opts = {} with youtube_dl.YoutubeDL(ydl_opts) as ydl: - ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc']) + ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc']) Most likely, you'll want to use various options. For a list of options available, have a look at youtube_dl/YoutubeDL.py. For a start, if you @@ -1590,7 +1805,7 @@ downloads/converts the video to an mp3 file: 'progress_hooks': [my_hook], } with youtube_dl.YoutubeDL(ydl_opts) as ydl: - ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc']) + ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc']) @@ -1598,9 +1813,9 @@ BUGS Bugs and suggestions should be reported at: -https://github.com/rg3/youtube-dl/issues. Unless you were prompted to or -there is another pertinent reason (e.g. GitHub fails to accept the bug -report), please do not send bug reports via personal email. For +https://github.com/ytdl-org/youtube-dl/issues. Unless you were prompted +to or there is another pertinent reason (e.g. GitHub fails to accept the +bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel #youtube-dl on freenode (webchat). @@ -1612,7 +1827,7 @@ to this: $ youtube-dl -v [debug] System config: [] [debug] User config: [] - [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] + [debug] Command-line args: [u'-v', u'https://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] youtube-dl version 2015.12.06 [debug] Git HEAD: 135392e @@ -1667,9 +1882,9 @@ command-line) or upload the .dump files you get when you add SITE SUPPORT REQUESTS MUST CONTAIN AN EXAMPLE URL. 
An example URL is a URL you might want to download, like -http://www.youtube.com/watch?v=BaW_jenozKc. There should be an obvious +https://www.youtube.com/watch?v=BaW_jenozKc. There should be an obvious video present. Except under very special circumstances, the main page of -a video service (e.g. http://www.youtube.com/) is _not_ an example URL. +a video service (e.g. https://www.youtube.com/) is _not_ an example URL. Are you using the latest version?