X-Git-Url: https://git.rapsys.eu/youtubedl/blobdiff_plain/e76f531201cd41dfc0ce00be28bcc5c575c7acc5..f1bb69fb5c709917a78f1bb6019781e1f76f40ae:/youtube-dl.1 diff --git a/youtube-dl.1 b/youtube-dl.1 index 78efa3d..9ab22b0 100644 --- a/youtube-dl.1 +++ b/youtube-dl.1 @@ -7,8 +7,8 @@ youtube\-dl \- download videos from youtube.com or other video platforms \f[B]youtube\-dl\f[] [OPTIONS] URL [URL...] .SH DESCRIPTION .PP -\f[B]youtube\-dl\f[] is a small command\-line program to download videos -from YouTube.com and a few more sites. +\f[B]youtube\-dl\f[] is a command\-line program to download videos from +YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. @@ -78,17 +78,33 @@ if this is not possible instead of searching. .TP .B \-\-ignore\-config Do not read configuration files. -When given in the global configuration file /etc /youtube\-dl.conf: Do +When given in the global configuration file /etc/youtube\-dl.conf: Do not read the user configuration in ~/.config/youtube\- dl/config (%APPDATA%/youtube\-dl/config.txt on Windows) .RS .RE .TP +.B \-\-config\-location \f[I]PATH\f[] +Location of the configuration file; either the path to the config or its +containing directory. +.RS +.RE +.TP .B \-\-flat\-playlist Do not extract the videos of a playlist, only list them. .RS .RE .TP +.B \-\-mark\-watched +Mark videos watched (YouTube only) +.RS +.RE +.TP +.B \-\-no\-mark\-watched +Do not mark videos watched (YouTube only) +.RS +.RE +.TP .B \-\-no\-color Do not emit color codes in output .RS @@ -96,7 +112,9 @@ Do not emit color codes in output .SS Network Options: .TP .B \-\-proxy \f[I]URL\f[] -Use the specified HTTP/HTTPS proxy. +Use the specified HTTP/HTTPS/SOCKS proxy. +To enable experimental SOCKS proxy, specify a proper scheme. +For example socks5://127.0.0.1:1080/. 
Pass in an empty string (\-\-proxy "") for direct connection .RS .RE @@ -107,27 +125,45 @@ Time to wait before giving up, in seconds .RE .TP .B \-\-source\-address \f[I]IP\f[] -Client\-side IP address to bind to (experimental) +Client\-side IP address to bind to .RS .RE .TP .B \-4, \-\-force\-ipv4 -Make all connections via IPv4 (experimental) +Make all connections via IPv4 .RS .RE .TP .B \-6, \-\-force\-ipv6 -Make all connections via IPv6 (experimental) +Make all connections via IPv6 .RS .RE +.SS Geo Restriction: .TP -.B \-\-cn\-verification\-proxy \f[I]URL\f[] -Use this proxy to verify the IP address for some Chinese sites. +.B \-\-geo\-verification\-proxy \f[I]URL\f[] +Use this proxy to verify the IP address for some geo\-restricted sites. The default proxy specified by \-\-proxy (or none, if the option is not present) is used for the actual downloading. +.RS +.RE +.TP +.B \-\-geo\-bypass +Bypass geographic restriction via faking X\-Forwarded\-For HTTP header (experimental) .RS .RE +.TP +.B \-\-no\-geo\-bypass +Do not bypass geographic restriction via faking X\-Forwarded\-For HTTP +header (experimental) +.RS +.RE +.TP +.B \-\-geo\-bypass\-country \f[I]CODE\f[] +Force bypass geographic restriction with explicitly provided two\-letter +ISO 3166\-1 alpha\-2 country code (experimental) +.RS +.RE .SS Video Selection: .TP .B \-\-playlist\-start \f[I]NUMBER\f[] @@ -205,17 +241,19 @@ Do not download any videos with more than COUNT views .RE .TP .B \-\-match\-filter \f[I]FILTER\f[] -Generic video filter (experimental). -Specify any key (see help for \-o for a list of available keys) to match -if the key is present, !key to check if the key is not present,key > -NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to -compare against a number, and & to require multiple matches. +Generic video filter. 
+Specify any key (see the "OUTPUT TEMPLATE" for a list of available keys) +to match if the key is present, !key to check if the key is not present, +key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, +=) to compare against a number, key = \[aq]LITERAL\[aq] (like "uploader += \[aq]Mike Smith\[aq]", also works with !=) to match against a string +literal and & to require multiple matches. Values which are not known are excluded unless you put a question mark -(?) after the operator.For example, to only match videos that have been -liked more than 100 times and disliked less than 50 times (or the -dislike functionality is not available at the given service), but who -also have a description, use \-\-match\-filter "like_count > 100 & -dislike_count 100 & dislike_count \\youtube\-dl.conf\f[]. +Note that by default configuration file may not exist so you may need to +create it yourself. +.PP For example, with the following configuration file youtube\-dl will -always extract the audio, not copy the mtime and use a proxy: +always extract the audio, not copy the mtime, use a proxy and save all +videos under \f[C]Movies\f[] directory in your home directory: .IP .nf \f[C] -\-\-extract\-audio +#\ Lines\ starting\ with\ #\ are\ comments + +#\ Always\ extract\ audio +\-x + +#\ Do\ not\ copy\ the\ mtime \-\-no\-mtime + +#\ Use\ this\ proxy \-\-proxy\ 127.0.0.1:3128 + +#\ Save\ all\ videos\ under\ Movies\ directory\ in\ your\ home\ directory +\-o\ ~/Movies/%(title)s.%(ext)s \f[] .fi .PP +Note that options in configuration file are just the same options aka +switches used in regular command line calls thus there \f[B]must be no +whitespace\f[] after \f[C]\-\f[] or \f[C]\-\-\f[], e.g. +\f[C]\-o\f[] or \f[C]\-\-proxy\f[] but not \f[C]\-\ o\f[] or +\f[C]\-\-\ proxy\f[]. +.PP You can use \f[C]\-\-ignore\-config\f[] if you want to disable the configuration file for a particular youtube\-dl run. 
+.PP +You can also use \f[C]\-\-config\-location\f[] if you want to use a custom +configuration file for a particular youtube\-dl run. .SS Authentication with \f[C]\&.netrc\f[] file .PP You may also want to configure automatic credentials storage for @@ -852,9 +953,10 @@ pass credentials as command line arguments on every youtube\-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a \f[C]\&.netrc\f[] -file (http://stackoverflow.com/tags/.netrc/info) on per extractor basis. -For that you will need to create a\f[C]\&.netrc\f[] file in your -\f[C]$HOME\f[] and restrict permissions to read/write by you only: +file (https://stackoverflow.com/tags/.netrc/info) on a per extractor +basis. +For that you will need to create a \f[C]\&.netrc\f[] file in your +\f[C]$HOME\f[] and restrict permissions to read/write by only you: .IP .nf \f[C] @@ -863,8 +965,9 @@ chmod\ a\-rwx,u+rw\ $HOME/.netrc \f[] .fi .PP -After that you can add credentials for extractor in the following -format, where \f[I]extractor\f[] is the name of extractor in lowercase: +After that you can add credentials for an extractor in the following +format, where \f[I]extractor\f[] is the name of the extractor in +lowercase: .IP .nf \f[C] @@ -887,53 +990,217 @@ file (#configuration). .PP On Windows you may also need to set up the \f[C]%HOME%\f[] environment variable manually. +For example: +.IP +.nf +\f[C] +set\ HOME=%USERPROFILE% +\f[] +.fi .SH OUTPUT TEMPLATE .PP The \f[C]\-o\f[] option allows users to indicate a template for the output file names. +.PP +\f[B]tl;dr:\f[] navigate me to examples (#output-template-examples). +.PP The basic usage is not to set any template arguments when downloading a single file, like in -\f[C]youtube\-dl\ \-o\ funny_video.flv\ "http://some/video"\f[]. +\f[C]youtube\-dl\ \-o\ funny_video.flv\ "https://some/video"\f[]. However, it may contain special sequences that will be replaced when downloading each video. 
-The special sequences have the format \f[C]%(NAME)s\f[]. +The special sequences may be formatted according to Python string +formatting +operations (https://docs.python.org/2/library/stdtypes.html#string-formatting). +For example, \f[C]%(NAME)s\f[] or \f[C]%(NAME)05d\f[]. To clarify, that is a percent symbol followed by a name in parentheses, -followed by a lowercase S. -Allowed names are: +followed by a formatting operation. +Allowed names along with sequence type are: +.IP \[bu] 2 +\f[C]id\f[] (string): Video identifier +.IP \[bu] 2 +\f[C]title\f[] (string): Video title +.IP \[bu] 2 +\f[C]url\f[] (string): Video URL +.IP \[bu] 2 +\f[C]ext\f[] (string): Video filename extension +.IP \[bu] 2 +\f[C]alt_title\f[] (string): A secondary title of the video +.IP \[bu] 2 +\f[C]display_id\f[] (string): An alternative identifier for the video +.IP \[bu] 2 +\f[C]uploader\f[] (string): Full name of the video uploader +.IP \[bu] 2 +\f[C]license\f[] (string): License name the video is licensed under +.IP \[bu] 2 +\f[C]creator\f[] (string): The creator of the video +.IP \[bu] 2 +\f[C]release_date\f[] (string): The date (YYYYMMDD) when the video was +released +.IP \[bu] 2 +\f[C]timestamp\f[] (numeric): UNIX timestamp of the moment the video +became available +.IP \[bu] 2 +\f[C]upload_date\f[] (string): Video upload date (YYYYMMDD) +.IP \[bu] 2 +\f[C]uploader_id\f[] (string): Nickname or id of the video uploader +.IP \[bu] 2 +\f[C]location\f[] (string): Physical location where the video was filmed +.IP \[bu] 2 +\f[C]duration\f[] (numeric): Length of the video in seconds +.IP \[bu] 2 +\f[C]view_count\f[] (numeric): How many users have watched the video on +the platform +.IP \[bu] 2 +\f[C]like_count\f[] (numeric): Number of positive ratings of the video +.IP \[bu] 2 +\f[C]dislike_count\f[] (numeric): Number of negative ratings of the +video +.IP \[bu] 2 +\f[C]repost_count\f[] (numeric): Number of reposts of the video +.IP \[bu] 2 +\f[C]average_rating\f[] (numeric): Average rating 
given by users, the +scale used depends on the webpage +.IP \[bu] 2 +\f[C]comment_count\f[] (numeric): Number of comments on the video +.IP \[bu] 2 +\f[C]age_limit\f[] (numeric): Age restriction for the video (years) +.IP \[bu] 2 +\f[C]format\f[] (string): A human\-readable description of the format +.IP \[bu] 2 +\f[C]format_id\f[] (string): Format code specified by +\f[C]\-\-format\f[] +.IP \[bu] 2 +\f[C]format_note\f[] (string): Additional info about the format +.IP \[bu] 2 +\f[C]width\f[] (numeric): Width of the video +.IP \[bu] 2 +\f[C]height\f[] (numeric): Height of the video +.IP \[bu] 2 +\f[C]resolution\f[] (string): Textual description of width and height +.IP \[bu] 2 +\f[C]tbr\f[] (numeric): Average bitrate of audio and video in KBit/s +.IP \[bu] 2 +\f[C]abr\f[] (numeric): Average audio bitrate in KBit/s .IP \[bu] 2 -\f[C]id\f[]: The sequence will be replaced by the video identifier. +\f[C]acodec\f[] (string): Name of the audio codec in use .IP \[bu] 2 -\f[C]url\f[]: The sequence will be replaced by the video URL. +\f[C]asr\f[] (numeric): Audio sampling rate in Hertz .IP \[bu] 2 -\f[C]uploader\f[]: The sequence will be replaced by the nickname of the -person who uploaded the video. +\f[C]vbr\f[] (numeric): Average video bitrate in KBit/s .IP \[bu] 2 -\f[C]upload_date\f[]: The sequence will be replaced by the upload date -in YYYYMMDD format. +\f[C]fps\f[] (numeric): Frame rate .IP \[bu] 2 -\f[C]title\f[]: The sequence will be replaced by the video title. +\f[C]vcodec\f[] (string): Name of the video codec in use .IP \[bu] 2 -\f[C]ext\f[]: The sequence will be replaced by the appropriate extension -(like flv or mp4). +\f[C]container\f[] (string): Name of the container format .IP \[bu] 2 -\f[C]epoch\f[]: The sequence will be replaced by the Unix epoch when -creating the file. 
+\f[C]filesize\f[] (numeric): The number of bytes, if known in advance .IP \[bu] 2 -\f[C]autonumber\f[]: The sequence will be replaced by a five\-digit -number that will be increased with each download, starting at zero. +\f[C]filesize_approx\f[] (numeric): An estimate for the number of bytes .IP \[bu] 2 -\f[C]playlist\f[]: The sequence will be replaced by the name or the id -of the playlist that contains the video. +\f[C]protocol\f[] (string): The protocol that will be used for the +actual download .IP \[bu] 2 -\f[C]playlist_index\f[]: The sequence will be replaced by the index of -the video in the playlist padded with leading zeros according to the -total length of the playlist. +\f[C]extractor\f[] (string): Name of the extractor .IP \[bu] 2 -\f[C]format_id\f[]: The sequence will be replaced by the format code -specified by \f[C]\-\-format\f[]. +\f[C]extractor_key\f[] (string): Key name of the extractor .IP \[bu] 2 -\f[C]duration\f[]: The sequence will be replaced by the length of the -video in seconds. 
+\f[C]epoch\f[] (numeric): Unix epoch when creating the file +.IP \[bu] 2 +\f[C]autonumber\f[] (numeric): Five\-digit number that will be increased +with each download, starting at zero +.IP \[bu] 2 +\f[C]playlist\f[] (string): Name or id of the playlist that contains the +video +.IP \[bu] 2 +\f[C]playlist_index\f[] (numeric): Index of the video in the playlist +padded with leading zeros according to the total length of the playlist +.IP \[bu] 2 +\f[C]playlist_id\f[] (string): Playlist identifier +.IP \[bu] 2 +\f[C]playlist_title\f[] (string): Playlist title +.PP +Available for the video that belongs to some logical chapter or section: +.IP \[bu] 2 +\f[C]chapter\f[] (string): Name or title of the chapter the video +belongs to +.IP \[bu] 2 +\f[C]chapter_number\f[] (numeric): Number of the chapter the video +belongs to +.IP \[bu] 2 +\f[C]chapter_id\f[] (string): Id of the chapter the video belongs to +.PP +Available for the video that is an episode of some series or programme: +.IP \[bu] 2 +\f[C]series\f[] (string): Title of the series or programme the video +episode belongs to +.IP \[bu] 2 +\f[C]season\f[] (string): Title of the season the video episode belongs +to +.IP \[bu] 2 +\f[C]season_number\f[] (numeric): Number of the season the video episode +belongs to +.IP \[bu] 2 +\f[C]season_id\f[] (string): Id of the season the video episode belongs +to +.IP \[bu] 2 +\f[C]episode\f[] (string): Title of the video episode +.IP \[bu] 2 +\f[C]episode_number\f[] (numeric): Number of the video episode within a +season +.IP \[bu] 2 +\f[C]episode_id\f[] (string): Id of the video episode +.PP +Available for the media that is a track or a part of a music album: +.IP \[bu] 2 +\f[C]track\f[] (string): Title of the track +.IP \[bu] 2 +\f[C]track_number\f[] (numeric): Number of the track within an album or +a disc +.IP \[bu] 2 +\f[C]track_id\f[] (string): Id of the track +.IP \[bu] 2 +\f[C]artist\f[] (string): Artist(s) of the track +.IP \[bu] 2 +\f[C]genre\f[] (string): Genre(s) of 
the track +.IP \[bu] 2 +\f[C]album\f[] (string): Title of the album the track belongs to +.IP \[bu] 2 +\f[C]album_type\f[] (string): Type of the album +.IP \[bu] 2 +\f[C]album_artist\f[] (string): List of all artists that appeared on the +album +.IP \[bu] 2 +\f[C]disc_number\f[] (numeric): Number of the disc or other physical +medium the track belongs to +.IP \[bu] 2 +\f[C]release_year\f[] (numeric): Year (YYYY) when the album was released +.PP +Each aforementioned sequence when referenced in an output template will +be replaced by the actual value corresponding to the sequence name. +Note that some of the sequences are not guaranteed to be present since +they depend on the metadata obtained by a particular extractor. +Missing sequences will be replaced with \f[C]NA\f[]. +.PP +For example, for \f[C]\-o\ %(title)s\-%(id)s.%(ext)s\f[] and an mp4 video +with title \f[C]youtube\-dl\ test\ video\f[] and id +\f[C]BaW_jenozKcj\f[], this will result in a +\f[C]youtube\-dl\ test\ video\-BaW_jenozKcj.mp4\f[] file created in the +current directory. +.PP +For numeric sequences you can use numeric related formatting, for +example, \f[C]%(view_count)05d\f[] will result in a string with view +count padded with zeros up to 5 characters, like in \f[C]00042\f[]. +.PP +Output templates can also contain an arbitrary hierarchical path, e.g. +\f[C]\-o\ \[aq]%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\f[] +which will result in downloading each video in a directory corresponding +to this path template. +Any missing directory will be automatically created for you. +.PP +To use percent literals in an output template use \f[C]%%\f[]. +To output to stdout use \f[C]\-o\ \-\f[]. .PP The current default template is \f[C]%(title)s\-%(id)s.%(ext)s\f[]. .PP @@ -942,54 +1209,167 @@ or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit\-unsafe channel. 
In these cases, add the \f[C]\-\-restrict\-filenames\f[] flag to get a shorter title: +.SS Output template and Windows batch files +.PP +If you are using an output template inside a Windows batch file then you +must escape plain percent characters (\f[C]%\f[]) by doubling, so that +\f[C]\-o\ "%(title)s\-%(id)s.%(ext)s"\f[] should become +\f[C]\-o\ "%%(title)s\-%%(id)s.%%(ext)s"\f[]. +However you should not touch \f[C]%\f[]\[aq]s that are not plain +characters, e.g. +environment variables for expansion should stay intact: +\f[C]\-o\ "C:\\%HOMEPATH%\\Desktop\\%%(title)s.%%(ext)s"\f[]. +.SS Output template examples +.PP +Note that on Windows you may need to use double quotes instead of +single. .IP .nf \f[C] -$\ youtube\-dl\ \-\-get\-filename\ \-o\ "%(title)s.%(ext)s"\ BaW_jenozKc +$\ youtube\-dl\ \-\-get\-filename\ \-o\ \[aq]%(title)s.%(ext)s\[aq]\ BaW_jenozKc youtube\-dl\ test\ video\ \[aq]\[aq]_ä↭𝕐.mp4\ \ \ \ #\ All\ kinds\ of\ weird\ characters -$\ youtube\-dl\ \-\-get\-filename\ \-o\ "%(title)s.%(ext)s"\ BaW_jenozKc\ \-\-restrict\-filenames + +$\ youtube\-dl\ \-\-get\-filename\ \-o\ \[aq]%(title)s.%(ext)s\[aq]\ BaW_jenozKc\ \-\-restrict\-filenames youtube\-dl_test_video_.mp4\ \ \ \ \ \ \ \ \ \ #\ A\ simple\ file\ name + +#\ Download\ YouTube\ playlist\ videos\ in\ separate\ directory\ indexed\ by\ video\ order\ in\ a\ playlist +$\ youtube\-dl\ \-o\ \[aq]%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\ https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re + +#\ Download\ all\ playlists\ of\ YouTube\ channel/user\ keeping\ each\ playlist\ in\ separate\ directory: +$\ youtube\-dl\ \-o\ \[aq]%(uploader)s/%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\ https://www.youtube.com/user/TheLinuxFoundation/playlists + +#\ Download\ Udemy\ course\ keeping\ each\ chapter\ in\ separate\ directory\ under\ MyVideos\ directory\ in\ your\ home +$\ youtube\-dl\ \-u\ user\ \-p\ password\ \-o\ \[aq]~/MyVideos/%(playlist)s/%(chapter_number)s\ 
\-\ %(chapter)s/%(title)s.%(ext)s\[aq]\ https://www.udemy.com/java\-tutorial/ + +#\ Download\ entire\ series\ season\ keeping\ each\ series\ and\ each\ season\ in\ separate\ directory\ under\ C:/MyVideos +$\ youtube\-dl\ \-o\ "C:/MyVideos/%(series)s/%(season_number)s\ \-\ %(season)s/%(episode_number)s\ \-\ %(episode)s.%(ext)s"\ https://videomore.ru/kino_v_detalayah/5_sezon/367617 + +#\ Stream\ the\ video\ being\ downloaded\ to\ stdout +$\ youtube\-dl\ \-o\ \-\ BaW_jenozKc \f[] .fi .SH FORMAT SELECTION .PP -By default youtube\-dl tries to download the best quality, but sometimes -you may want to download in a different format. -The simplest case is requesting a specific format, for example -\f[C]\-f\ 22\f[]. -You can get the list of available formats using -\f[C]\-\-list\-formats\f[], you can also use a file extension (currently -it supports aac, m4a, mp3, mp4, ogg, wav, webm) or the special names -\f[C]best\f[], \f[C]bestvideo\f[], \f[C]bestaudio\f[] and -\f[C]worst\f[]. +By default youtube\-dl tries to download the best available quality, +i.e. +if you want the best quality you \f[B]don\[aq]t need\f[] to pass any +special options, youtube\-dl will guess it for you by \f[B]default\f[]. +.PP +But sometimes you may want to download in a different format, for +example when you are on a slow or intermittent connection. +The key mechanism for achieving this is so\-called \f[I]format +selection\f[] based on which you can explicitly specify desired format, +select formats based on some criterion or criteria, setup precedence and +much more. +.PP +The general syntax for format selection is \f[C]\-\-format\ FORMAT\f[] +or shorter \f[C]\-f\ FORMAT\f[] where \f[C]FORMAT\f[] is a \f[I]selector +expression\f[], i.e. +an expression that describes format or formats you would like to +download. +.PP +\f[B]tl;dr:\f[] navigate me to examples (#format-selection-examples). 
+.PP +The simplest case is requesting a specific format, for example with +\f[C]\-f\ 22\f[] you can download the format with format code equal to +22. +You can get the list of available format codes for a particular video +using \f[C]\-\-list\-formats\f[] or \f[C]\-F\f[]. +Note that these format codes are extractor specific. +.PP +You can also use a file extension (currently \f[C]3gp\f[], \f[C]aac\f[], +\f[C]flv\f[], \f[C]m4a\f[], \f[C]mp3\f[], \f[C]mp4\f[], \f[C]ogg\f[], +\f[C]wav\f[], \f[C]webm\f[] are supported) to download the best quality +format of a particular file extension served as a single file, e.g. +\f[C]\-f\ webm\f[] will download the best quality format with the +\f[C]webm\f[] extension served as a single file. +.PP +You can also use special names to select particular edge case formats: +\- \f[C]best\f[]: Select the best quality format represented by a single +file with video and audio. +\- \f[C]worst\f[]: Select the worst quality format represented by a +single file with video and audio. +\- \f[C]bestvideo\f[]: Select the best quality video\-only format (e.g. +DASH video). +May not be available. +\- \f[C]worstvideo\f[]: Select the worst quality video\-only format. +May not be available. +\- \f[C]bestaudio\f[]: Select the best quality audio\-only format. +May not be available. +\- \f[C]worstaudio\f[]: Select the worst quality audio\-only format. +May not be available. +.PP +For example, to download the worst quality video\-only format you can +use \f[C]\-f\ worstvideo\f[]. .PP If you want to download multiple videos and they don\[aq]t have the same formats available, you can specify the order of preference using -slashes, as in \f[C]\-f\ 22/17/18\f[]. -You can also filter the video results by putting a condition in +slashes. +Note that slash is left\-associative, i.e. 
+formats on the left hand side are preferred, for example +\f[C]\-f\ 22/17/18\f[] will download format 22 if it\[aq]s available, +otherwise it will download format 17 if it\[aq]s available, otherwise it +will download format 18 if it\[aq]s available, otherwise it will +complain that no suitable formats are available for download. +.PP +If you want to download several formats of the same video use a comma as +a separator, e.g. +\f[C]\-f\ 22,17,18\f[] will download all these three formats, of course +if they are available. +Or a more sophisticated example combined with the precedence feature: +\f[C]\-f\ 136/137/mp4/bestvideo,140/m4a/bestaudio\f[]. +.PP +You can also filter the video formats by putting a condition in brackets, as in \f[C]\-f\ "best[height=720]"\f[] (or \f[C]\-f\ "[filesize>10M]"\f[]). -This works for filesize, height, width, tbr, abr, vbr, asr, and fps and -the comparisons <, <=, >, >=, =, != and for ext, acodec, vcodec, -container, and protocol and the comparisons =, != . 
+.PP +The following numeric meta fields can be used with comparisons +\f[C]<\f[], \f[C]<=\f[], \f[C]>\f[], \f[C]>=\f[], \f[C]=\f[] (equals), +\f[C]!=\f[] (not equals): \- \f[C]filesize\f[]: The number of bytes, if +known in advance \- \f[C]width\f[]: Width of the video, if known \- +\f[C]height\f[]: Height of the video, if known \- \f[C]tbr\f[]: Average +bitrate of audio and video in KBit/s \- \f[C]abr\f[]: Average audio +bitrate in KBit/s \- \f[C]vbr\f[]: Average video bitrate in KBit/s \- +\f[C]asr\f[]: Audio sampling rate in Hertz \- \f[C]fps\f[]: Frame rate +.PP +Filtering also works for the comparisons \f[C]=\f[] (equals), \f[C]!=\f[] +(not equals), \f[C]^=\f[] (begins with), \f[C]$=\f[] (ends with), +\f[C]*=\f[] (contains) and the following string meta fields: \- +\f[C]ext\f[]: File extension \- \f[C]acodec\f[]: Name of the audio codec +in use \- \f[C]vcodec\f[]: Name of the video codec in use \- +\f[C]container\f[]: Name of the container format \- \f[C]protocol\f[]: +The protocol that will be used for the actual download, lower\-case +(\f[C]http\f[], \f[C]https\f[], \f[C]rtsp\f[], \f[C]rtmp\f[], +\f[C]rtmpe\f[], \f[C]mms\f[], \f[C]f4m\f[], \f[C]ism\f[], +\f[C]http_dash_segments\f[], \f[C]m3u8\f[], or \f[C]m3u8_native\f[]) \- +\f[C]format_id\f[]: A short description of the format +.PP +Note that none of the aforementioned meta fields are guaranteed to be +present since this solely depends on the metadata obtained by a particular +extractor, i.e. +the metadata offered by the video hoster. +.PP Formats for which the value is not known are excluded unless you put a -question mark (?) after the operator. +question mark (\f[C]?\f[]) after the operator. You can combine format filters, so \f[C]\-f\ "[height\ <=?\ 720][tbr>500]"\f[] selects up to 720p videos (or videos where the height is not known) with a bitrate of more than 500 KBit/s. -Use commas to download multiple formats, such as -\f[C]\-f\ 136/137/mp4/bestvideo,140/m4a/bestaudio\f[]. 
+.PP You can merge the video and audio of two formats into a single file using \f[C]\-f\ +\f[] (requires ffmpeg or -avconv), for example \f[C]\-f\ bestvideo+bestaudio\f[]. +avconv installed), for example \f[C]\-f\ bestvideo+bestaudio\f[] will +download the best video\-only format, the best audio\-only format and +mux them together with ffmpeg/avconv. +.PP Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use \f[C]\-f\ \[aq](mp4,webm)[height<480]\[aq]\f[]. .PP -Since the end of April 2015 and version 2015.04.26 youtube\-dl uses -\f[C]\-f\ bestvideo+bestaudio/best\f[] as default format selection (see -#5447, #5456). +Since the end of April 2015 and version 2015.04.26, youtube\-dl uses +\f[C]\-f\ bestvideo+bestaudio/best\f[] as the default format selection +(see #5447 (https://github.com/rg3/youtube-dl/issues/5447), +#5456 (https://github.com/rg3/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading \f[C]bestvideo\f[] and \f[C]bestaudio\f[] separately and muxing them together into a single file giving the best overall quality available. @@ -998,7 +1378,7 @@ best available quality served as a single file. \f[C]best\f[] is also needed for videos that don\[aq]t come from YouTube because they don\[aq]t provide the audio and video in two different files. -If you want to only download some dash formats (for example if you are +If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add \f[C]\-f\ bestvideo[height<=?1080]+bestaudio/best\f[] to your configuration file. @@ -1015,6 +1395,32 @@ you want to download the best available quality media served as a single file, you should explicitly specify your choice with \f[C]\-f\ best\f[]. 
You may want to add it to the configuration file (#configuration) in order not to type it every time you run youtube\-dl. +.SS Format selection examples +.PP +Note that on Windows you may need to use double quotes instead of +single. +.IP +.nf +\f[C] +#\ Download\ best\ mp4\ format\ available\ or\ any\ other\ best\ if\ no\ mp4\ available +$\ youtube\-dl\ \-f\ \[aq]bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best\[aq] + +#\ Download\ best\ format\ available\ but\ not\ better\ than\ 480p +$\ youtube\-dl\ \-f\ \[aq]bestvideo[height<=480]+bestaudio/best[height<=480]\[aq] + +#\ Download\ best\ video\ only\ format\ but\ no\ bigger\ than\ 50\ MB +$\ youtube\-dl\ \-f\ \[aq]best[filesize<50M]\[aq] + +#\ Download\ best\ format\ available\ via\ direct\ link\ over\ HTTP/HTTPS\ protocol +$\ youtube\-dl\ \-f\ \[aq](bestvideo+bestaudio/best)[protocol^=http]\[aq] + +#\ Download\ the\ best\ video\ format\ and\ the\ best\ audio\ format\ without\ merging\ them +$\ youtube\-dl\ \-f\ \[aq]bestvideo,bestaudio\[aq]\ \-o\ \[aq]%(title)s.f%(format_id)s.%(ext)s\[aq] +\f[] +.fi +.PP +Note that in the last example, an output template is recommended as +bestvideo and bestaudio may have the same file name. .SH VIDEO SELECTION .PP Videos can be filtered by their upload date using the options @@ -1044,7 +1450,7 @@ $\ youtube\-dl\ \-\-dateafter\ 20000101\ \-\-datebefore\ 20091231 .SS How do I update youtube\-dl? .PP If you\[aq]ve followed our manual installation -instructions (http://rg3.github.io/youtube-dl/download.html), you can +instructions (https://rg3.github.io/youtube-dl/download.html), you can simply run \f[C]youtube\-dl\ \-U\f[] (or, on Linux, \f[C]sudo\ youtube\-dl\ \-U\f[]). .PP @@ -1057,7 +1463,7 @@ mechanism to update. Note that distribution packages are often outdated. As a rule of thumb, youtube\-dl releases at least once a month, and often weekly or even daily. -Simply go to http://yt\-dl.org/ to find out the current version. 
+Simply go to https://yt\-dl.org to find out the current version. Unfortunately, there is nothing we youtube\-dl developers can do if your distribution serves a really outdated version. You can (and should) complain to your distribution in their bugtracker @@ -1074,7 +1480,7 @@ sudo\ apt\-get\ remove\ \-y\ youtube\-dl .fi .PP Afterwards, simply follow our manual installation -instructions (http://rg3.github.io/youtube-dl/download.html): +instructions (https://rg3.github.io/youtube-dl/download.html): .IP .nf \f[C] @@ -1086,6 +1492,10 @@ hash\ \-r .PP Again, from then on you\[aq]ll be able to update with \f[C]sudo\ youtube\-dl\ \-U\f[]. +.SS youtube\-dl is extremely slow to start on Windows +.PP +Add a file exclusion for \f[C]youtube\-dl.exe\f[] in Windows Defender +settings. .SS I\[aq]m getting an error \f[C]Unable\ to\ extract\ OpenGraph\ title\f[] on YouTube playlists .PP @@ -1100,10 +1510,18 @@ Since we are not affiliated with Ubuntu, there is little we can do. Feel free to report bugs (https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to the Ubuntu packaging -guys (mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) +people (mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl) \- all they have to do is update the package to a somewhat recent version. See above for a way to update. +.SS I\[aq]m getting an error when trying to use output template: +\f[C]error:\ using\ output\ template\ conflicts\ with\ using\ title,\ video\ ID\ or\ auto\ number\f[] +.PP +Make sure you are not using \f[C]\-o\f[] with any of these options +\f[C]\-t\f[], \f[C]\-\-title\f[], \f[C]\-\-id\f[], \f[C]\-A\f[] or +\f[C]\-\-auto\-number\f[] set in command line or in a configuration +file. +Remove the latter if any. .SS Do I always have to pass \f[C]\-citw\f[]? 
.PP By default, youtube\-dl intends to have the best options (incidentally, @@ -1130,22 +1548,40 @@ Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We\[aq]re considering providing a way to let you solve the CAPTCHA (https://github.com/rg3/youtube-dl/issues/154), but at the -moment, your best course of action is pointing a webbrowser to the +moment, your best course of action is pointing a web browser to the YouTube URL, solving the CAPTCHA, and restarting youtube\-dl. +.SS Do I need any other programs? +.PP +youtube\-dl works fine on its own on most sites. +However, if you want to convert video/audio, you\[aq]ll need +avconv (https://libav.org/) or ffmpeg (https://www.ffmpeg.org/). +On some sites \- most notably YouTube \- videos can be retrieved in a +higher quality format without sound. +youtube\-dl will detect whether avconv/ffmpeg is present and +automatically pick the best option. +.PP +Videos or video formats streamed via the RTMP protocol can only be +downloaded when rtmpdump (https://rtmpdump.mplayerhq.hu/) is installed. +Downloading MMS and RTSP videos requires either +mplayer (https://mplayerhq.hu/) or mpv (https://mpv.io/) to be +installed. .SS I have downloaded a video but how can I play it? .PP Once the video is fully downloaded, use any video player, such as -vlc (http://www.videolan.org) or mplayer (http://www.mplayerhq.hu/). +mpv (https://mpv.io/), vlc (https://www.videolan.org/) or +mplayer (https://www.mplayerhq.hu/). .SS I extracted a video URL with \f[C]\-g\f[], but it does not play on -another machine / in my webbrowser. +another machine / in my web browser. .PP It depends a lot on the service. In many cases, requests for the video (to download/play it) must come -from the same IP address and with the same cookies. +from the same IP address and with the same cookies and/or HTTP headers. 
Use the \f[C]\-\-cookies\f[] option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use \f[C]\-\-dump\-user\-agent\f[] to see the one in use by youtube\-dl. +You can also get necessary cookies and HTTP headers from JSON output +obtained with \f[C]\-\-dump\-json\f[]. .PP It may be beneficial to use IPv6; in some cases, the restrictions are only applied to IPv4. @@ -1228,17 +1664,18 @@ means you\[aq]re using an outdated version of Python. Please update to Python 2.6 or 2.7. .SS What is this binary file? Where has the code gone? .PP -Since June 2012 (#342) youtube\-dl is packed as an executable zipfile, -simply unzip it (might need renaming to \f[C]youtube\-dl.zip\f[] first -on some systems) or clone the git repository, as laid out above. +Since June 2012 (#342 (https://github.com/rg3/youtube-dl/issues/342)) +youtube\-dl is packed as an executable zipfile, simply unzip it (might +need renaming to \f[C]youtube\-dl.zip\f[] first on some systems) or +clone the git repository, as laid out above. If you modify the code, you can run it by executing the \f[C]__main__.py\f[] file. To recompile the executable, run \f[C]make\ youtube\-dl\f[]. -.SS The exe throws a \f[I]Runtime error from Visual C++\f[] +.SS The exe throws an error due to missing \f[C]MSVCR100.dll\f[] .PP -To run the exe you need to install first the Microsoft Visual C++ 2008 -Redistributable -Package (http://www.microsoft.com/en-us/download/details.aspx?id=29). +To run the exe you need to install first the Microsoft Visual C++ 2010 +Redistributable Package +(x86) (https://www.microsoft.com/en-US/download/details.aspx?id=5555). .SS On Windows, how should I set up ffmpeg and youtube\-dl? Where should I put the exe files? .PP @@ -1263,21 +1700,30 @@ Use the \f[C]\-o\f[] to specify an output template (#output-template), for example \f[C]\-o\ "/home/user/videos/%(title)s\-%(id)s.%(ext)s"\f[]. 
If you want this for all of your downloads, put the option into your configuration file (#configuration). -.SS How do I download a video starting with a \f[C]\-\f[] ? +.SS How do I download a video starting with a \f[C]\-\f[]? .PP -Either prepend \f[C]http://www.youtube.com/watch?v=\f[] or separate the +Either prepend \f[C]https://www.youtube.com/watch?v=\f[] or separate the ID from the options with \f[C]\-\-\f[]: .IP .nf \f[C] youtube\-dl\ \-\-\ \-wNyEUrxzFU -youtube\-dl\ "http://www.youtube.com/watch?v=\-wNyEUrxzFU" +youtube\-dl\ "https://www.youtube.com/watch?v=\-wNyEUrxzFU" \f[] .fi .SS How do I pass cookies to youtube\-dl? .PP Use the \f[C]\-\-cookies\f[] option, for example \f[C]\-\-cookies\ /path/to/cookies/file.txt\f[]. +.PP +In order to extract cookies from browser use any conforming browser +extension for exporting cookies. +For example, +cookies.txt (https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) +(for Chrome) or Export +Cookies (https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) +(for Firefox). +.PP Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either \f[C]#\ HTTP\ Cookie\ File\f[] or @@ -1285,13 +1731,84 @@ first line of the cookies file must be either Make sure you have correct newline format (https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely -\f[C]CRLF\f[] (\f[C]\\r\\n\f[]) for Windows, \f[C]LF\f[] (\f[C]\\n\f[]) -for Linux and \f[C]CR\f[] (\f[C]\\r\f[]) for Mac OS. +\f[C]CRLF\f[] (\f[C]\\r\\n\f[]) for Windows and \f[C]LF\f[] +(\f[C]\\n\f[]) for Unix and Unix\-like systems (Linux, Mac OS, etc.). \f[C]HTTP\ Error\ 400:\ Bad\ Request\f[] when using \f[C]\-\-cookies\f[] is a good sign of invalid newline format. .PP Passing cookies to youtube\-dl is a good way to workaround login when a particular extractor does not implement it explicitly. 
+Another use case is working around +CAPTCHA (https://en.wikipedia.org/wiki/CAPTCHA) some websites require +you to solve in particular cases in order to get access (e.g. +YouTube, CloudFlare). +.SS How do I stream directly to media player? +.PP +You will first need to tell youtube\-dl to stream media to stdout with +\f[C]\-o\ \-\f[], and also tell your media player to read from stdin (it +must be capable of this for streaming) and then pipe former to latter. +For example, streaming to vlc (https://www.videolan.org/) can be +achieved with: +.IP +.nf +\f[C] +youtube\-dl\ \-o\ \-\ "https://www.youtube.com/watch?v=BaW_jenozKcj"\ |\ vlc\ \- +\f[] +.fi +.SS How do I download only new videos from a playlist? +.PP +Use download\-archive feature. +With this feature you should initially download the complete playlist +with \f[C]\-\-download\-archive\ /path/to/download/archive/file.txt\f[] +that will record identifiers of all the videos in a special file. +Each subsequent run with the same \f[C]\-\-download\-archive\f[] will +download only new videos and skip all videos that have been downloaded +before. +Note that only successful downloads are recorded in the file. +.PP +For example, at first, +.IP +.nf +\f[C] +youtube\-dl\ \-\-download\-archive\ archive.txt\ "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re" +\f[] +.fi +.PP +will download the complete \f[C]PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re\f[] +playlist and create a file \f[C]archive.txt\f[]. +Each subsequent run will only download new videos if any: +.IP +.nf +\f[C] +youtube\-dl\ \-\-download\-archive\ archive.txt\ "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re" +\f[] +.fi +.SS Should I add \f[C]\-\-hls\-prefer\-native\f[] into my config? +.PP +When youtube\-dl detects an HLS video, it can download it either with +the built\-in downloader or ffmpeg. 
+Since many HLS streams are slightly invalid and ffmpeg/youtube\-dl each +handle some invalid cases better than the other, there is an option to +switch the downloader if needed. +.PP +When youtube\-dl knows that one particular downloader works better for a +given website, that downloader will be picked. +Otherwise, youtube\-dl will pick the best downloader for general +compatibility, which at the moment happens to be ffmpeg. +This choice may change in future versions of youtube\-dl, with +improvements of the built\-in downloader and/or ffmpeg. +.PP +In particular, the generic extractor (used when your website is not in +the list of supported sites by +youtube\-dl (https://rg3.github.io/youtube-dl/supportedsites.html) +cannot mandate one specific downloader. +.PP +If you put either \f[C]\-\-hls\-prefer\-native\f[] or +\f[C]\-\-hls\-prefer\-ffmpeg\f[] into your configuration, a different +subset of videos will fail to download correctly. +Instead, it is much better to file an issue (https://yt-dl.org/bug) or a +pull request which details why the native or the ffmpeg HLS downloader +is a better choice for your use case. .SS Can you add support for this anime video site, or site which shows current movies for free? .PP @@ -1351,8 +1868,8 @@ Please do not declare your issue as \f[C]important\f[] or For one, have a look at the list of supported sites (docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme -(say, from http://example.com/video/1234567 to -http://example.com/v/1234567 ) and youtube\-dl reports an URL of a +(say, from https://example.com/video/1234567 to +https://example.com/v/1234567 ) and youtube\-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug. .PP @@ -1373,10 +1890,31 @@ to a video or unsupported. 
You can find out which by examining the output (if you run youtube\-dl on the console) or catching an \f[C]UnsupportedError\f[] exception if you run it from a Python program. +.SH Why do I need to go through that much red tape when filing bugs? +.PP +Before we had the issue template, despite our extensive bug reporting +instructions (#bugs), about 80% of the issue reports we got were +useless, for instance because people used ancient versions hundreds of +releases old, because of simple syntactic errors (not in youtube\-dl but +in general shell usage), because the problem was already reported +multiple times before, because people did not actually read an error +message, even if it said "please install ffmpeg", because people did not +mention the URL they were trying to download and many more simple, +easy\-to\-avoid problems, many of which were totally unrelated to +youtube\-dl. +.PP +youtube\-dl is an open\-source project manned by too few volunteers, so +we\[aq]d rather spend time fixing bugs where we are certain none of +those simple problems apply, and where we can be reasonably confident to +be able to reproduce the issue without asking the reporter repeatedly. +As such, the output of \f[C]youtube\-dl\ \-v\ YOUR_URL_HERE\f[] is +really all that\[aq]s required to file an issue. +The issue template also guides you through some basic steps you can do, +such as checking that your version of youtube\-dl is current. .SH DEVELOPER INSTRUCTIONS .PP Most users do not need to build youtube\-dl and can download the -builds (http://rg3.github.io/youtube-dl/download.html) or get them from +builds (https://rg3.github.io/youtube-dl/download.html) or get them from their distribution. .PP To run youtube\-dl as a developer, you don\[aq]t need to build anything @@ -1400,11 +1938,14 @@ nosetests \f[] .fi .PP +See item 6 of new extractor tutorial (#adding-support-for-a-new-site) +for how to run extractor specific test cases. 
+.PP If you want to create a build of youtube\-dl yourself, you\[aq]ll need .IP \[bu] 2 python .IP \[bu] 2 -make +make (only GNU make is supported) .IP \[bu] 2 pandoc .IP \[bu] 2 @@ -1413,16 +1954,38 @@ zip nosetests .SS Adding support for a new site .PP -If you want to add support for a new site, you can follow this quick -list (assuming your service is called \f[C]yourextractor\f[]): +If you want to add support for a new site, first of all \f[B]make +sure\f[] this site is \f[B]not dedicated to copyright +infringement (README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)\f[]. +youtube\-dl does \f[B]not support\f[] such sites thus pull requests +adding support for them \f[B]will be rejected\f[]. +.PP +After you have ensured this site is distributing its content legally, +you can follow this quick list (assuming your service is called +\f[C]yourextractor\f[]): .IP " 1." 4 Fork this repository (https://github.com/rg3/youtube-dl/fork) .IP " 2." 4 -Check out the source code with -\f[C]git\ clone\ git\@github.com:YOUR_GITHUB_USERNAME/youtube\-dl.git\f[] +Check out the source code with: +.RS 4 +.IP +.nf +\f[C] +git\ clone\ git\@github.com:YOUR_GITHUB_USERNAME/youtube\-dl.git +\f[] +.fi +.RE .IP " 3." 4 Start a new git branch with -\f[C]cd\ youtube\-dl;\ git\ checkout\ \-b\ yourextractor\f[] +.RS 4 +.IP +.nf +\f[C] +cd\ youtube\-dl +git\ checkout\ \-b\ yourextractor +\f[] +.fi +.RE .IP " 4." 
4 Start with this simple template and save it to \f[C]youtube_dl/extractor/yourextractor.py\f[]: @@ -1439,13 +2002,13 @@ from\ .common\ import\ InfoExtractor class\ YourExtractorIE(InfoExtractor): \ \ \ \ _VALID_URL\ =\ r\[aq]https?://(?:www\\.)?yourextractor\\.com/watch/(?P<id>[0\-9]+)\[aq] \ \ \ \ _TEST\ =\ { -\ \ \ \ \ \ \ \ \[aq]url\[aq]:\ \[aq]http://yourextractor.com/watch/42\[aq], +\ \ \ \ \ \ \ \ \[aq]url\[aq]:\ \[aq]https://yourextractor.com/watch/42\[aq], \ \ \ \ \ \ \ \ \[aq]md5\[aq]:\ \[aq]TODO:\ md5\ sum\ of\ the\ first\ 10241\ bytes\ of\ the\ video\ file\ (use\ \-\-test)\[aq], \ \ \ \ \ \ \ \ \[aq]info_dict\[aq]:\ { \ \ \ \ \ \ \ \ \ \ \ \ \[aq]id\[aq]:\ \[aq]42\[aq], \ \ \ \ \ \ \ \ \ \ \ \ \[aq]ext\[aq]:\ \[aq]mp4\[aq], \ \ \ \ \ \ \ \ \ \ \ \ \[aq]title\[aq]:\ \[aq]Video\ title\ goes\ here\[aq], -\ \ \ \ \ \ \ \ \ \ \ \ \[aq]thumbnail\[aq]:\ \[aq]re:^https?://.*\\.jpg$\[aq], +\ \ \ \ \ \ \ \ \ \ \ \ \[aq]thumbnail\[aq]:\ r\[aq]re:^https?://.*\\.jpg$\[aq], \ \ \ \ \ \ \ \ \ \ \ \ #\ TODO\ more\ properties,\ either\ as: \ \ \ \ \ \ \ \ \ \ \ \ #\ *\ A\ value \ \ \ \ \ \ \ \ \ \ \ \ #\ *\ MD5\ checksum;\ start\ the\ string\ with\ md5: @@ -1473,7 +2036,7 @@ class\ YourExtractorIE(InfoExtractor): .RE .IP " 5." 4 Add an import in -\f[C]youtube_dl/extractor/__init__.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py). +\f[C]youtube_dl/extractor/extractors.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). .IP " 6." 4 Run \f[C]python\ test/test_download.py\ TestDownload.test_YourExtractor\f[]. @@ -1484,25 +2047,31 @@ If you decide to add more than one test, then rename \f[C]_TEST\f[] to The tests will then be named \f[C]TestDownload.test_YourExtractor\f[], \f[C]TestDownload.test_YourExtractor_1\f[], \f[C]TestDownload.test_YourExtractor_2\f[], etc. +Note that tests with \f[C]only_matching\f[] key in test\[aq]s dict are +not counted in. .IP " 7." 
4 Have a look at \f[C]youtube_dl/extractor/common.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a detailed description of what your extractor should and may -return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L62-L200). +return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want. .IP " 8." 4 -If you can, check the code with +Make sure your code follows youtube\-dl coding +conventions (#youtube-dl-coding-conventions) and check the code with flake8 (https://pypi.python.org/pypi/flake8). +Also make sure your code works under all +Python (https://www.python.org/) versions claimed to be supported by +youtube\-dl, namely 2.6, 2.7, and 3.2+. .IP " 9." 4 -When the tests pass, add (http://git-scm.com/docs/git-add) the new files -and commit (http://git-scm.com/docs/git-commit) them and -push (http://git-scm.com/docs/git-push) the result, like this: +When the tests pass, add (https://git-scm.com/docs/git-add) the new +files and commit (https://git-scm.com/docs/git-commit) them and +push (https://git-scm.com/docs/git-push) the result, like this: .RS 4 .IP .nf \f[C] -$\ git\ add\ youtube_dl/extractor/__init__.py +$\ git\ add\ youtube_dl/extractor/extractors.py $\ git\ add\ youtube_dl/extractor/yourextractor.py $\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq] $\ git\ push\ origin\ yourextractor @@ -1515,6 +2084,222 @@ request (https://help.github.com/articles/creating-a-pull-request). We\[aq]ll then review and merge it. .PP In any case, thank you very much for your contributions! +.SS youtube\-dl coding conventions +.PP +This section introduces guidelines for writing idiomatic, robust and +future\-proof extractor code. 
+.PP +Extractors are very fragile by nature since they depend on the layout of +the source data provided by 3rd party media hosters out of your control +and this layout tends to change. +As an extractor implementer your task is not only to write code that +will extract media links and metadata correctly but also to minimize +dependency on the source\[aq]s layout and even to make the code foresee +potential future changes and be ready for that. +This is important because it will allow the extractor not to break on +minor layout changes thus keeping old youtube\-dl versions working. +Even though this breakage issue is easily fixed by emitting a new +version of youtube\-dl with a fix incorporated, all the previous +versions become broken in all repositories and distros\[aq] packages +that may not be so prompt in fetching the update from us. +Needless to say, some non rolling release distros may never receive an +update at all. +.SS Mandatory and optional metafields +.PP +For extraction to work youtube\-dl relies on metadata your extractor +extracts and provides to youtube\-dl expressed by an information +dictionary (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) +or simply \f[I]info dict\f[]. +Only the following meta fields in the \f[I]info dict\f[] are considered +mandatory for a successful extraction process by youtube\-dl: +.IP \[bu] 2 +\f[C]id\f[] (media identifier) +.IP \[bu] 2 +\f[C]title\f[] (media title) +.IP \[bu] 2 +\f[C]url\f[] (media download URL) or \f[C]formats\f[] +.PP +In fact only the last option is technically mandatory (i.e. +if you can\[aq]t figure out the download location of the media the +extraction does not make any sense). +But by convention youtube\-dl also treats \f[C]id\f[] and \f[C]title\f[] +as mandatory. 
+Thus the aforementioned metafields are the critical data that the +extraction does not make any sense without and if any of them fail to be +extracted then the extractor is considered completely broken. +.PP +Any +field (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) +apart from the aforementioned ones are considered \f[B]optional\f[]. +That means that extraction should be \f[B]tolerant\f[] to situations +when sources for these fields can potentially be unavailable (even if +they are always available at the moment) and \f[B]future\-proof\f[] in +order not to break the extraction of general purpose mandatory fields. +.SS Example +.PP +Say you have some source dictionary \f[C]meta\f[] that you\[aq]ve +fetched as JSON with HTTP request and it has a key \f[C]summary\f[]: +.IP +.nf +\f[C] +meta\ =\ self._download_json(url,\ video_id) +\f[] +.fi +.PP +Assume at this point \f[C]meta\f[]\[aq]s layout is: +.IP +.nf +\f[C] +{ +\ \ \ \ ... +\ \ \ \ "summary":\ "some\ fancy\ summary\ text", +\ \ \ \ ... +} +\f[] +.fi +.PP +Assume you want to extract \f[C]summary\f[] and put it into the +resulting info dict as \f[C]description\f[]. +Since \f[C]description\f[] is an optional meta field you should be ready +that this key may be missing from the \f[C]meta\f[] dict, so that you +should extract it like: +.IP +.nf +\f[C] +description\ =\ meta.get(\[aq]summary\[aq])\ \ #\ correct +\f[] +.fi +.PP +and not like: +.IP +.nf +\f[C] +description\ =\ meta[\[aq]summary\[aq]]\ \ #\ incorrect +\f[] +.fi +.PP +The latter will break extraction process with \f[C]KeyError\f[] if +\f[C]summary\f[] disappears from \f[C]meta\f[] at some later time but +with the former approach extraction will just go ahead with +\f[C]description\f[] set to \f[C]None\f[] which is perfectly fine +(remember \f[C]None\f[] is equivalent to the absence of data). 
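The difference between the two lookups can be checked in isolation. The following is a minimal sketch in plain Python with no youtube\-dl imports; the \f[C]meta\f[] dict is hypothetical sample data standing in for a parsed JSON response, and the \f[C]info\f[] dict only mimics the shape of an info dict.

```python
# Hypothetical metadata standing in for a parsed JSON response.
# Note that the optional 'summary' key is absent.
meta = {'id': '42', 'title': 'Video title goes here'}

# Optional field: dict.get() returns None when the key is missing.
description = meta.get('summary')

# Direct indexing raises KeyError for the same missing key.
try:
    meta['summary']
    indexing_failed = False
except KeyError:
    indexing_failed = True

# Mandatory fields are indexed directly, so a missing one fails loudly.
info = {
    'id': meta['id'],
    'title': meta['title'],
    'description': description,
}
```

Here the mandatory fields still fail loudly if they go missing, while the optional one degrades to \f[C]None\f[].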
+.PP +Similarly, you should pass \f[C]fatal=False\f[] when extracting optional +data from a webpage with \f[C]_search_regex\f[], +\f[C]_html_search_regex\f[] or similar methods, for instance: +.IP +.nf +\f[C] +description\ =\ self._search_regex( +\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq], +\ \ \ \ webpage,\ \[aq]description\[aq],\ fatal=False) +\f[] +.fi +.PP +With \f[C]fatal\f[] set to \f[C]False\f[] if \f[C]_search_regex\f[] +fails to extract \f[C]description\f[] it will emit a warning and +continue extraction. +.PP +You can also pass \f[C]default=<some\ fallback\ value>\f[], for example: +.IP +.nf +\f[C] +description\ =\ self._search_regex( +\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq], +\ \ \ \ webpage,\ \[aq]description\[aq],\ default=None) +\f[] +.fi +.PP +On failure this code will silently continue the extraction with +\f[C]description\f[] set to \f[C]None\f[]. +That is useful for metafields that may or may not be present. +.SS Provide fallbacks +.PP +When extracting metadata try to do so from multiple sources. +For example if \f[C]title\f[] is present in several places, try +extracting from at least some of them. +This makes it more future\-proof in case some of the sources become +unavailable. +.SS Example +.PP +Say \f[C]meta\f[] from the previous example has a \f[C]title\f[] and you +are about to extract it. +Since \f[C]title\f[] is a mandatory meta field you should end up with +something like: +.IP +.nf +\f[C] +title\ =\ meta[\[aq]title\[aq]] +\f[] +.fi +.PP +If \f[C]title\f[] disappears from \f[C]meta\f[] in future due to some +changes on the hoster\[aq]s side the extraction would fail since +\f[C]title\f[] is mandatory. +That\[aq]s expected. +.PP +Assume that you have another source you can extract \f[C]title\f[] +from, for example \f[C]og:title\f[] HTML meta of a \f[C]webpage\f[]. 
+In this case you can provide a fallback scenario: +.IP +.nf +\f[C] +title\ =\ meta.get(\[aq]title\[aq])\ or\ self._og_search_title(webpage) +\f[] +.fi +.PP +This code will try to extract from \f[C]meta\f[] first and if it fails +it will try extracting \f[C]og:title\f[] from a \f[C]webpage\f[]. +.SS Make regular expressions flexible +.PP +When using regular expressions try to write them fuzzy and flexible. +.SS Example +.PP +Say you need to extract \f[C]title\f[] from the following HTML code: +.IP +.nf +\f[C] +<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">some\ fancy\ title</span> +\f[] +.fi +.PP +The code for that task should look similar to: +.IP +.nf +\f[C] +title\ =\ self._search_regex( +\ \ \ \ r\[aq]<span[^>]+class="title"[^>]*>([^<]+)\[aq],\ webpage,\ \[aq]title\[aq]) +\f[] +.fi +.PP +Or even better: +.IP +.nf +\f[C] +title\ =\ self._search_regex( +\ \ \ \ r\[aq]<span[^>]+class=(["\\\[aq]])title\\1[^>]*>(?P<title>[^<]+)\[aq], +\ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq]) +\f[] +.fi +.PP +Note how you tolerate potential changes in the \f[C]style\f[] +attribute\[aq]s value or switch from using double quotes to single for +\f[C]class\f[] attribute. +.PP +The code definitely should not look like: +.IP +.nf +\f[C] +title\ =\ self._search_regex( +\ \ \ \ r\[aq]<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">(.*?)</span>\[aq], +\ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq]) +\f[] +.fi +.SS Use safe conversion functions +.PP +Wrap all extracted numeric data into safe functions from \f[C]utils\f[]: +\f[C]int_or_none\f[], \f[C]float_or_none\f[]. +Use them for string to number conversions as well. 
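To illustrate why safe conversion matters, here is a minimal sketch of what such helpers might look like. The functions \f[C]int_or_none_sketch\f[] and \f[C]float_or_none_sketch\f[] are hypothetical, simplified stand\-ins written for this example; they are not the actual \f[C]utils\f[] implementations, which accept additional parameters. The \f[C]meta\f[] dict is made\-up sample data.

```python
def int_or_none_sketch(value, default=None):
    """Simplified stand-in for an int_or_none-style helper (illustration only)."""
    if value is None or value == '':
        return default
    try:
        return int(value)
    except (ValueError, TypeError):
        # Unparseable input yields the default instead of raising.
        return default


def float_or_none_sketch(value, default=None):
    """Simplified stand-in for a float_or_none-style helper (illustration only)."""
    if value is None or value == '':
        return default
    try:
        return float(value)
    except (ValueError, TypeError):
        return default


# Hypothetical source data: numbers arrive as strings, may be None or junk.
meta = {'view_count': '1234', 'duration': '12.5',
        'like_count': None, 'width': 'n/a'}

view_count = int_or_none_sketch(meta.get('view_count'))
duration = float_or_none_sketch(meta.get('duration'))
like_count = int_or_none_sketch(meta.get('like_count'))
width = int_or_none_sketch(meta.get('width'))
```

With helpers of this kind, a missing or malformed optional number becomes \f[C]None\f[] rather than an exception that aborts the whole extraction.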
.SH EMBEDDING YOUTUBE\-DL .PP youtube\-dl makes the best effort to be a good command\-line program, @@ -1532,13 +2317,13 @@ import\ youtube_dl ydl_opts\ =\ {} with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl: -\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]]) +\ \ \ \ ydl.download([\[aq]https://www.youtube.com/watch?v=BaW_jenozKc\[aq]]) \f[] .fi .PP Most likely, you\[aq]ll want to use various options. -For a list of what can be done, have a look at -youtube_dl/YoutubeDL.py (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L117-L265). +For a list of options available, have a look at +\f[C]youtube_dl/YoutubeDL.py\f[] (https://github.com/rg3/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312). For a start, if you want to intercept youtube\-dl\[aq]s output, set a \f[C]logger\f[] object. .PP @@ -1579,20 +2364,45 @@ ydl_opts\ =\ { \ \ \ \ \[aq]progress_hooks\[aq]:\ [my_hook], } with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl: -\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]]) +\ \ \ \ ydl.download([\[aq]https://www.youtube.com/watch?v=BaW_jenozKc\[aq]]) \f[] .fi .SH BUGS .PP Bugs and suggestions should be reported at: -<https://github.com/rg3/youtube-dl/issues> . -Unless you were prompted so or there is another pertinent reason (e.g. +<https://github.com/rg3/youtube-dl/issues>. +Unless you were prompted to or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. -For discussions, join us in the irc channel #youtube\-dl on freenode. +For discussions, join us in the IRC channel +#youtube\-dl (irc://chat.freenode.net/#youtube-dl) on freenode +(webchat (https://webchat.freenode.net/?randomnick=1&channels=youtube-dl)). .PP \f[B]Please include the full output of youtube\-dl when run with -\f[C]\-v\f[]\f[]. +\f[C]\-v\f[]\f[], i.e. 
+\f[B]add\f[] \f[C]\-v\f[] flag to \f[B]your command line\f[], copy the +\f[B]whole\f[] output and post it in the issue body wrapped in ``` for +better formatting. +It should look similar to this: +.IP +.nf +\f[C] +$\ youtube\-dl\ \-v\ <your\ command\ line> +[debug]\ System\ config:\ [] +[debug]\ User\ config:\ [] +[debug]\ Command\-line\ args:\ [u\[aq]\-v\[aq],\ u\[aq]https://www.youtube.com/watch?v=BaW_jenozKcj\[aq]] +[debug]\ Encodings:\ locale\ cp1251,\ fs\ mbcs,\ out\ cp866,\ pref\ cp1251 +[debug]\ youtube\-dl\ version\ 2015.12.06 +[debug]\ Git\ HEAD:\ 135392e +[debug]\ Python\ version\ 2.6.6\ \-\ Windows\-2003Server\-5.2.3790\-SP2 +[debug]\ exe\ versions:\ ffmpeg\ N\-75573\-g1d0487f,\ ffprobe\ N\-75573\-g1d0487f,\ rtmpdump\ 2.4 +[debug]\ Proxy\ map:\ {} +\&... +\f[] +.fi +.PP +\f[B]Do not post screenshots of verbose logs; only plain text is +acceptable.\f[] .PP The output (including the first lines) contains important debugging information. @@ -1624,7 +2434,7 @@ If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We\[aq]re often too polite to close the issue outright, but the missing info makes misinterpretation likely. -As a commiter myself, I often get frustrated by these issues, since the +As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over. .PP @@ -1645,11 +2455,11 @@ command\-line) or upload the \f[C]\&.dump\f[] files you get when you add .PP \f[B]Site support requests must contain an example URL\f[]. An example URL is a URL you might want to download, like -http://www.youtube.com/watch?v=BaW_jenozKc . +\f[C]https://www.youtube.com/watch?v=BaW_jenozKc\f[]. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. -http://www.youtube.com/ ) is \f[I]not\f[] an example URL. 
+\f[C]https://www.youtube.com/\f[]) is \f[I]not\f[] an example URL. .SS Are you using the latest version? .PP Before reporting any issue, type \f[C]youtube\-dl\ \-U\f[]. @@ -1661,8 +2471,9 @@ This goes for feature requests as well. .PP Make sure that someone has not already opened the issue you\[aq]re trying to open. -Search at the top of the window or at -https://github.com/rg3/youtube\-dl/search?type=Issues . +Search at the top of the window or browse the GitHub +Issues (https://github.com/rg3/youtube-dl/search?type=Issues) of this +repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". @@ -1672,7 +2483,7 @@ activity. .PP Before requesting a new feature, please have a quick peek at the list of supported -options (https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). +options (https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do \f[I]not\f[] solve your problem. @@ -1709,7 +2520,7 @@ splits the issue into multiple ones. In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). -Do not request support for vimeo user videos, Whitehouse podcasts, and +Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don\[aq]t post bug reports alongside feature requests. @@ -1727,8 +2538,8 @@ requires them. .SS Is your question about youtube\-dl? .PP It may sound strange, but some bug reports we receive are completely -unrelated to youtube\-dl and relate to a different or even the -reporter\[aq]s own application. 
+unrelated to youtube\-dl and relate to a different, or even the +reporter\[aq]s own, application. Please make sure that you are actually using youtube\-dl. If you are using a UI for youtube\-dl, report the bug to the maintainer of the actual application providing the UI. @@ -1739,6 +2550,6 @@ bug. .PP youtube\-dl is released into the public domain by the copyright holders. .PP -This README file was originally written by Daniel Bolton -(<https://github.com/dbbolton>) and is likewise released into the public -domain. +This README file was originally written by Daniel +Bolton (https://github.com/dbbolton) and is likewise released into the +public domain.