+make (only GNU make is supported)
+.IP \[bu] 2
+pandoc
+.IP \[bu] 2
+zip
+.IP \[bu] 2
+nosetests
+.SS Adding support for a new site
+.PP
+If you want to add support for a new site, first of all \f[B]make
+sure\f[] this site is \f[B]not dedicated to copyright
+infringement (README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)\f[].
+youtube\-dl does \f[B]not support\f[] such sites thus pull requests
+adding support for them \f[B]will be rejected\f[].
+.PP
+After you have ensured this site is distributing its content legally,
+you can follow this quick list (assuming your service is called
+\f[C]yourextractor\f[]):
+.IP " 1." 4
+Fork this repository (https://github.com/rg3/youtube-dl/fork)
+.IP " 2." 4
+Check out the source code with:
+.RS 4
+.IP
+.nf
+\f[C]
+git\ clone\ git\@github.com:YOUR_GITHUB_USERNAME/youtube\-dl.git
+\f[]
+.fi
+.RE
+.IP " 3." 4
+Start a new git branch with
+.RS 4
+.IP
+.nf
+\f[C]
+cd\ youtube\-dl
+git\ checkout\ \-b\ yourextractor
+\f[]
+.fi
+.RE
+.IP " 4." 4
+Start with this simple template and save it to
+\f[C]youtube_dl/extractor/yourextractor.py\f[]:
+.RS 4
+.IP
+.nf
+\f[C]
+#\ coding:\ utf\-8
+from\ __future__\ import\ unicode_literals
+
+from\ .common\ import\ InfoExtractor
+
+
+class\ YourExtractorIE(InfoExtractor):
+\ \ \ \ _VALID_URL\ =\ r\[aq]https?://(?:www\\.)?yourextractor\\.com/watch/(?P<id>[0\-9]+)\[aq]
+\ \ \ \ _TEST\ =\ {
+\ \ \ \ \ \ \ \ \[aq]url\[aq]:\ \[aq]http://yourextractor.com/watch/42\[aq],
+\ \ \ \ \ \ \ \ \[aq]md5\[aq]:\ \[aq]TODO:\ md5\ sum\ of\ the\ first\ 10241\ bytes\ of\ the\ video\ file\ (use\ \-\-test)\[aq],
+\ \ \ \ \ \ \ \ \[aq]info_dict\[aq]:\ {
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]id\[aq]:\ \[aq]42\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]ext\[aq]:\ \[aq]mp4\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]title\[aq]:\ \[aq]Video\ title\ goes\ here\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]thumbnail\[aq]:\ r\[aq]re:^https?://.*\\.jpg$\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ #\ TODO\ more\ properties,\ either\ as:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ A\ value
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ MD5\ checksum;\ start\ the\ string\ with\ md5:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ A\ regular\ expression;\ start\ the\ string\ with\ re:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ Any\ Python\ type\ (for\ example\ int\ or\ float)
+\ \ \ \ \ \ \ \ }
+\ \ \ \ }
+
+\ \ \ \ def\ _real_extract(self,\ url):
+\ \ \ \ \ \ \ \ video_id\ =\ self._match_id(url)
+\ \ \ \ \ \ \ \ webpage\ =\ self._download_webpage(url,\ video_id)
+
+\ \ \ \ \ \ \ \ #\ TODO\ more\ code\ goes\ here,\ for\ example\ ...
+\ \ \ \ \ \ \ \ title\ =\ self._html_search_regex(r\[aq]<h1>(.+?)</h1>\[aq],\ webpage,\ \[aq]title\[aq])
+
+\ \ \ \ \ \ \ \ return\ {
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]id\[aq]:\ video_id,
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]title\[aq]:\ title,
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]description\[aq]:\ self._og_search_description(webpage),
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]uploader\[aq]:\ self._search_regex(r\[aq]<div[^>]+id="uploader"[^>]*>([^<]+)<\[aq],\ webpage,\ \[aq]uploader\[aq],\ fatal=False),
+\ \ \ \ \ \ \ \ \ \ \ \ #\ TODO\ more\ properties\ (see\ youtube_dl/extractor/common.py)
+\ \ \ \ \ \ \ \ }
+\f[]
+.fi
+.RE
+.IP " 5." 4
+Add an import in
+\f[C]youtube_dl/extractor/extractors.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
+.IP " 6." 4
+Run
+\f[C]python\ test/test_download.py\ TestDownload.test_YourExtractor\f[].
+This \f[I]should fail\f[] at first, but you can continually re\-run it
+until you\[aq]re done.
+If you decide to add more than one test, then rename \f[C]_TEST\f[] to
+\f[C]_TESTS\f[] and make it into a list of dictionaries.
+The tests will then be named \f[C]TestDownload.test_YourExtractor\f[],
+\f[C]TestDownload.test_YourExtractor_1\f[],
+\f[C]TestDownload.test_YourExtractor_2\f[], etc.
+.IP " 7." 4
+Have a look at
+\f[C]youtube_dl/extractor/common.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py)
+for possible helper methods and a detailed description of what your
+extractor should and may
+return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252).
+Add tests and code for as many as you want.
+.IP " 8." 4
+Make sure your code follows youtube\-dl coding
+conventions (#youtube-dl-coding-conventions) and check the code with
+flake8 (https://pypi.python.org/pypi/flake8).
+Also make sure your code works under all Python (http://www.python.org/)
+versions claimed supported by youtube\-dl, namely 2.6, 2.7, and 3.2+.
+.IP " 9." 4
+When the tests pass, add (http://git-scm.com/docs/git-add) the new files
+and commit (http://git-scm.com/docs/git-commit) them and
+push (http://git-scm.com/docs/git-push) the result, like this:
+.RS 4
+.IP
+.nf
+\f[C]
+$\ git\ add\ youtube_dl/extractor/extractors.py
+$\ git\ add\ youtube_dl/extractor/yourextractor.py
+$\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq]
+$\ git\ push\ origin\ yourextractor
+\f[]
+.fi
+.RE
+.IP "10." 4
+Finally, create a pull
+request (https://help.github.com/articles/creating-a-pull-request).
+We\[aq]ll then review and merge it.
+.PP
+In any case, thank you very much for your contributions!
+.SS youtube\-dl coding conventions
+.PP
+This section introduces a guide lines for writing idiomatic, robust and
+future\-proof extractor code.
+.PP
+Extractors are very fragile by nature since they depend on the layout of
+the source data provided by 3rd party media hosters out of your control
+and this layout tends to change.
+As an extractor implementer your task is not only to write code that
+will extract media links and metadata correctly but also to minimize
+dependency on the source\[aq]s layout and even to make the code foresee
+potential future changes and be ready for that.
+This is important because it will allow the extractor not to break on
+minor layout changes thus keeping old youtube\-dl versions working.
+Even though this breakage issue is easily fixed by emitting a new
+version of youtube\-dl with a fix incorporated, all the previous
+versions become broken in all repositories and distros\[aq] packages
+that may not be so prompt in fetching the update from us.
+Needless to say, some non rolling release distros may never receive an
+update at all.
+.SS Mandatory and optional metafields
+.PP
+For extraction to work youtube\-dl relies on metadata your extractor
+extracts and provides to youtube\-dl expressed by an information
+dictionary (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257)
+or simply \f[I]info dict\f[].
+Only the following meta fields in the \f[I]info dict\f[] are considered
+mandatory for a successful extraction process by youtube\-dl:
+.IP \[bu] 2
+\f[C]id\f[] (media identifier)
+.IP \[bu] 2
+\f[C]title\f[] (media title)
+.IP \[bu] 2
+\f[C]url\f[] (media download URL) or \f[C]formats\f[]
+.PP
+In fact only the last option is technically mandatory (i.e.
+if you can\[aq]t figure out the download location of the media the
+extraction does not make any sense).
+But by convention youtube\-dl also treats \f[C]id\f[] and \f[C]title\f[]
+as mandatory.
+Thus the aforementioned metafields are the critical data that the
+extraction does not make any sense without and if any of them fail to be
+extracted then the extractor is considered completely broken.
+.PP
+Any
+field (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257)
+apart from the aforementioned ones are considered \f[B]optional\f[].
+That means that extraction should be \f[B]tolerant\f[] to situations
+when sources for these fields can potentially be unavailable (even if
+they are always available at the moment) and \f[B]future\-proof\f[] in
+order not to break the extraction of general purpose mandatory fields.
+.SS Example
+.PP
+Say you have some source dictionary \f[C]meta\f[] that you\[aq]ve
+fetched as JSON with HTTP request and it has a key \f[C]summary\f[]:
+.IP
+.nf
+\f[C]
+meta\ =\ self._download_json(url,\ video_id)
+\f[]
+.fi
+.PP
+Assume at this point \f[C]meta\f[]\[aq]s layout is:
+.IP
+.nf
+\f[C]
+{
+\ \ \ \ ...
+\ \ \ \ "summary":\ "some\ fancy\ summary\ text",
+\ \ \ \ ...
+}
+\f[]
+.fi
+.PP
+Assume you want to extract \f[C]summary\f[] and put it into the
+resulting info dict as \f[C]description\f[].
+Since \f[C]description\f[] is an optional meta field you should be ready
+that this key may be missing from the \f[C]meta\f[] dict, so that you
+should extract it like:
+.IP
+.nf
+\f[C]
+description\ =\ meta.get(\[aq]summary\[aq])\ \ #\ correct
+\f[]
+.fi
+.PP
+and not like:
+.IP
+.nf
+\f[C]
+description\ =\ meta[\[aq]summary\[aq]]\ \ #\ incorrect
+\f[]
+.fi
+.PP
+The latter will break extraction process with \f[C]KeyError\f[] if
+\f[C]summary\f[] disappears from \f[C]meta\f[] at some later time but
+with the former approach extraction will just go ahead with
+\f[C]description\f[] set to \f[C]None\f[] which is perfectly fine
+(remember \f[C]None\f[] is equivalent to the absence of data).
+.PP
+Similarly, you should pass \f[C]fatal=False\f[] when extracting optional
+data from a webpage with \f[C]_search_regex\f[],
+\f[C]_html_search_regex\f[] or similar methods, for instance:
+.IP
+.nf
+\f[C]
+description\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq],
+\ \ \ \ webpage,\ \[aq]description\[aq],\ fatal=False)
+\f[]
+.fi
+.PP
+With \f[C]fatal\f[] set to \f[C]False\f[] if \f[C]_search_regex\f[]
+fails to extract \f[C]description\f[] it will emit a warning and
+continue extraction.
+.PP
+You can also pass \f[C]default=<some\ fallback\ value>\f[], for example:
+.IP
+.nf
+\f[C]
+description\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq],
+\ \ \ \ webpage,\ \[aq]description\[aq],\ default=None)
+\f[]
+.fi
+.PP
+On failure this code will silently continue the extraction with
+\f[C]description\f[] set to \f[C]None\f[].
+That is useful for metafields that may or may not be present.
+.SS Provide fallbacks
+.PP
+When extracting metadata try to do so from multiple sources.
+For example if \f[C]title\f[] is present in several places, try
+extracting from at least some of them.
+This makes it more future\-proof in case some of the sources become
+unavailable.
+.SS Example
+.PP
+Say \f[C]meta\f[] from the previous example has a \f[C]title\f[] and you
+are about to extract it.
+Since \f[C]title\f[] is a mandatory meta field you should end up with
+something like:
+.IP
+.nf
+\f[C]
+title\ =\ meta[\[aq]title\[aq]]
+\f[]
+.fi
+.PP
+If \f[C]title\f[] disappears from \f[C]meta\f[] in future due to some
+changes on the hoster\[aq]s side the extraction would fail since
+\f[C]title\f[] is mandatory.
+That\[aq]s expected.
+.PP
+Assume that you have some another source you can extract \f[C]title\f[]
+from, for example \f[C]og:title\f[] HTML meta of a \f[C]webpage\f[].
+In this case you can provide a fallback scenario:
+.IP
+.nf
+\f[C]
+title\ =\ meta.get(\[aq]title\[aq])\ or\ self._og_search_title(webpage)
+\f[]
+.fi
+.PP
+This code will try to extract from \f[C]meta\f[] first and if it fails
+it will try extracting \f[C]og:title\f[] from a \f[C]webpage\f[].
+.SS Make regular expressions flexible
+.PP
+When using regular expressions try to write them fuzzy and flexible.
+.SS Example
+.PP
+Say you need to extract \f[C]title\f[] from the following HTML code:
+.IP
+.nf
+\f[C]
+<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">some\ fancy\ title</span>
+\f[]
+.fi
+.PP
+The code for that task should look similar to:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+class="title"[^>]*>([^<]+)\[aq],\ webpage,\ \[aq]title\[aq])
+\f[]
+.fi
+.PP
+Or even better:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+class=(["\\\[aq]])title\\1[^>]*>(?P<title>[^<]+)\[aq],
+\ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq])
+\f[]
+.fi
+.PP
+Note how you tolerate potential changes in the \f[C]style\f[]
+attribute\[aq]s value or switch from using double quotes to single for
+\f[C]class\f[] attribute:
+.PP
+The code definitely should not look like:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">(.*?)</span>\[aq],
+\ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq])
+\f[]
+.fi
+.SS Use safe conversion functions
+.PP
+Wrap all extracted numeric data into safe functions from \f[C]utils\f[]:
+\f[C]int_or_none\f[], \f[C]float_or_none\f[].
+Use them for string to number conversions as well.
+.SH EMBEDDING YOUTUBE\-DL
+.PP
+youtube\-dl makes the best effort to be a good command\-line program,
+and thus should be callable from any programming language.
+If you encounter any problems parsing its output, feel free to create a
+report (https://github.com/rg3/youtube-dl/issues/new).
+.PP
+From a Python program, you can embed youtube\-dl in a more powerful
+fashion, like this:
+.IP
+.nf
+\f[C]
+from\ __future__\ import\ unicode_literals
+import\ youtube_dl
+
+ydl_opts\ =\ {}
+with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl:
+\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]])
+\f[]
+.fi
+.PP
+Most likely, you\[aq]ll want to use various options.
+For a list of options available, have a look at
+\f[C]youtube_dl/YoutubeDL.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279).
+For a start, if you want to intercept youtube\-dl\[aq]s output, set a
+\f[C]logger\f[] object.
+.PP
+Here\[aq]s a more complete example of a program that outputs only errors
+(and a short message after the download is finished), and
+downloads/converts the video to an mp3 file:
+.IP
+.nf
+\f[C]
+from\ __future__\ import\ unicode_literals
+import\ youtube_dl
+
+
+class\ MyLogger(object):
+\ \ \ \ def\ debug(self,\ msg):
+\ \ \ \ \ \ \ \ pass
+
+\ \ \ \ def\ warning(self,\ msg):
+\ \ \ \ \ \ \ \ pass
+
+\ \ \ \ def\ error(self,\ msg):
+\ \ \ \ \ \ \ \ print(msg)
+
+
+def\ my_hook(d):
+\ \ \ \ if\ d[\[aq]status\[aq]]\ ==\ \[aq]finished\[aq]:
+\ \ \ \ \ \ \ \ print(\[aq]Done\ downloading,\ now\ converting\ ...\[aq])
+
+
+ydl_opts\ =\ {
+\ \ \ \ \[aq]format\[aq]:\ \[aq]bestaudio/best\[aq],
+\ \ \ \ \[aq]postprocessors\[aq]:\ [{
+\ \ \ \ \ \ \ \ \[aq]key\[aq]:\ \[aq]FFmpegExtractAudio\[aq],
+\ \ \ \ \ \ \ \ \[aq]preferredcodec\[aq]:\ \[aq]mp3\[aq],
+\ \ \ \ \ \ \ \ \[aq]preferredquality\[aq]:\ \[aq]192\[aq],
+\ \ \ \ }],
+\ \ \ \ \[aq]logger\[aq]:\ MyLogger(),
+\ \ \ \ \[aq]progress_hooks\[aq]:\ [my_hook],
+}
+with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl:
+\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]])
+\f[]
+.fi
+.SH BUGS
+.PP
+Bugs and suggestions should be reported at:
+<https://github.com/rg3/youtube-dl/issues>.
+Unless you were prompted to or there is another pertinent reason (e.g.
+GitHub fails to accept the bug report), please do not send bug reports
+via personal email.
+For discussions, join us in the IRC channel
+#youtube\-dl (irc://chat.freenode.net/#youtube-dl) on freenode
+(webchat (http://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
+.PP
+\f[B]Please include the full output of youtube\-dl when run with
+\f[C]\-v\f[]\f[], i.e.
+\f[B]add\f[] \f[C]\-v\f[] flag to \f[B]your command line\f[], copy the
+\f[B]whole\f[] output and post it in the issue body wrapped in ``` for
+better formatting.
+It should look similar to this:
+.IP
+.nf
+\f[C]
+$\ youtube\-dl\ \-v\ <your\ command\ line>
+[debug]\ System\ config:\ []
+[debug]\ User\ config:\ []
+[debug]\ Command\-line\ args:\ [u\[aq]\-v\[aq],\ u\[aq]http://www.youtube.com/watch?v=BaW_jenozKcj\[aq]]
+[debug]\ Encodings:\ locale\ cp1251,\ fs\ mbcs,\ out\ cp866,\ pref\ cp1251
+[debug]\ youtube\-dl\ version\ 2015.12.06
+[debug]\ Git\ HEAD:\ 135392e
+[debug]\ Python\ version\ 2.6.6\ \-\ Windows\-2003Server\-5.2.3790\-SP2
+[debug]\ exe\ versions:\ ffmpeg\ N\-75573\-g1d0487f,\ ffprobe\ N\-75573\-g1d0487f,\ rtmpdump\ 2.4
+[debug]\ Proxy\ map:\ {}
+\&...
+\f[]
+.fi
+.PP
+\f[B]Do not post screenshots of verbose logs; only plain text is
+acceptable.\f[]
+.PP
+The output (including the first lines) contains important debugging
+information.
+Issues without the full output are often not reproducible and therefore
+do not get solved in short order, if ever.
+.PP
+Please re\-read your issue once again to avoid a couple of common
+mistakes (you can and should use this as a checklist):
+.SS Is the description of the issue itself sufficient?
+.PP
+We often get issue reports that we cannot really decipher.
+While in most cases we eventually get the required information after
+asking back multiple times, this poses an unnecessary drain on our
+resources.
+Many contributors, including myself, are also not native speakers, so we
+may misread some parts.
+.PP
+So please elaborate on what feature you are requesting, or what bug you
+want to be fixed.
+Make sure that it\[aq]s obvious