X-Git-Url: https://git.rapsys.eu/.gitweb.cgi/youtubedl/blobdiff_plain/233624c1db781ee7dabbaf88453cf18e248dd20d..32569172c58bbc32bcc4fdb64af8615f98c4cccb:/youtube-dl.1?ds=sidebyside diff --git a/youtube-dl.1 b/youtube-dl.1 index 61ee72f..cb8f218 100644 --- a/youtube-dl.1 +++ b/youtube-dl.1 @@ -1033,7 +1033,7 @@ formatting operations (https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, \f[C]%(NAME)s\f[] or \f[C]%(NAME)05d\f[]. To clarify, that is a percent symbol followed by a name in parentheses, -followed by a formatting operations. +followed by formatting operations. Allowed names along with sequence type are: .IP \[bu] 2 \f[C]id\f[] (string): Video identifier @@ -1064,6 +1064,11 @@ became available .IP \[bu] 2 \f[C]uploader_id\f[] (string): Nickname or id of the video uploader .IP \[bu] 2 +\f[C]channel\f[] (string): Full name of the channel the video is +uploaded on +.IP \[bu] 2 +\f[C]channel_id\f[] (string): Id of the channel +.IP \[bu] 2 \f[C]location\f[] (string): Physical location where the video was filmed .IP \[bu] 2 \f[C]duration\f[] (numeric): Length of the video in seconds @@ -2086,15 +2091,24 @@ Have a look at \f[C]youtube_dl/extractor/common.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a detailed description of what your extractor should and may -return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). +return (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. .IP " 8." 4 Make sure your code follows youtube\-dl coding conventions and check the -code with flake8 (https://pypi.python.org/pypi/flake8). -Also make sure your code works under all -Python (https://www.python.org/) versions claimed supported by -youtube\-dl, namely 2.6, 2.7, and 3.2+. 
+code with +flake8 (http://flake8.pycqa.org/en/latest/index.html#quickstart): +.RS 4 +.IP +.nf +\f[C] +\ $\ flake8\ youtube_dl/extractor/yourextractor.py +\f[] +.fi +.RE .IP " 9." 4 +Make sure your code works under all Python (https://www.python.org/) +versions claimed supported by youtube\-dl, namely 2.6, 2.7, and 3.2+. +.IP "10." 4 When the tests pass, add (https://git-scm.com/docs/git-add) the new files and commit (https://git-scm.com/docs/git-commit) them and push (https://git-scm.com/docs/git-push) the result, like this: @@ -2102,14 +2116,14 @@ push (https://git-scm.com/docs/git-push) the result, like this: .IP .nf \f[C] -\ $\ git\ add\ youtube_dl/extractor/extractors.py -\ $\ git\ add\ youtube_dl/extractor/yourextractor.py -\ $\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq] -\ $\ git\ push\ origin\ yourextractor +$\ git\ add\ youtube_dl/extractor/extractors.py +$\ git\ add\ youtube_dl/extractor/yourextractor.py +$\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq] +$\ git\ push\ origin\ yourextractor \f[] .fi .RE -.IP "10." 4 +.IP "11." 4 Finally, create a pull request (https://help.github.com/articles/creating-a-pull-request). We\[aq]ll then review and merge it. @@ -2139,7 +2153,7 @@ update at all. .PP For extraction to work youtube\-dl relies on metadata your extractor extracts and provides to youtube\-dl expressed by an information -dictionary (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) +dictionary (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply \f[I]info dict\f[]. Only the following meta fields in the \f[I]info dict\f[] are considered mandatory for a successful extraction process by youtube\-dl: @@ -2160,7 +2174,7 @@ extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. 
.PP Any -field (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) +field (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered \f[B]optional\f[]. That means that extraction should be \f[B]tolerant\f[] to situations when sources for these fields can potentially be unavailable (even if @@ -2281,9 +2295,37 @@ title\ =\ meta.get(\[aq]title\[aq])\ or\ self._og_search_title(webpage) .PP This code will try to extract from \f[C]meta\f[] first and if it fails it will try extracting \f[C]og:title\f[] from a \f[C]webpage\f[]. -.SS Make regular expressions flexible +.SS Regular expressions +.SS Don\[aq]t capture groups you don\[aq]t use .PP -When using regular expressions try to write them fuzzy and flexible. +Capturing group must be an indication that it\[aq]s used somewhere in +the code. +Any group that is not used must be non capturing. +.SS Example +.PP +Don\[aq]t capture id attribute name here since you can\[aq]t use it for +anything anyway. +.PP +Correct: +.IP +.nf +\f[C] +r\[aq](?:id|ID)=(?P<id>\\d+)\[aq] +\f[] +.fi +.PP +Incorrect: +.IP +.nf +\f[C] +r\[aq](id|ID)=(?P<id>\\d+)\[aq] +\f[] +.fi +.SS Make regular expressions relaxed and flexible +.PP +When using regular expressions try to write them fuzzy, relaxed and +flexible, skipping insignificant parts that are more likely to change, +allowing both single and double quotes for quoted values and so on. .SS Example .PP Say you need to extract \f[C]title\f[] from the following HTML code: @@ -2326,11 +2368,64 @@ title\ =\ self._search_regex( \ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq]) \f[] .fi +.SS Long lines policy +.PP +There is a soft limit to keep lines of code under 80 characters long. +This means it should be respected if possible and if it does not make +readability and code maintenance worse.
+.PP +For example, you should \f[B]never\f[] split long string literals like +URLs or some other often copied entities over multiple lines to fit this +limit: +.PP +Correct: +.IP +.nf +\f[C] +\[aq]https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4\[aq] +\f[] +.fi +.PP +Incorrect: +.IP +.nf +\f[C] +\[aq]https://www.youtube.com/watch?v=FqZTN594JQw&list=\[aq] +\[aq]PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4\[aq] +\f[] +.fi .SS Use safe conversion functions .PP -Wrap all extracted numeric data into safe functions from \f[C]utils\f[]: +Wrap all extracted numeric data into safe functions from +\f[C]youtube_dl/utils.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py): \f[C]int_or_none\f[], \f[C]float_or_none\f[]. Use them for string to number conversions as well. +.PP +Use \f[C]url_or_none\f[] for safe URL processing. +.PP +Use \f[C]try_get\f[] for safe metadata extraction from parsed JSON. +.PP +Explore +\f[C]youtube_dl/utils.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py) +for more useful convenience functions. +.SS More examples +.SS Safely extract optional description from parsed JSON +.IP +.nf +\f[C] +description\ =\ try_get(response,\ lambda\ x:\ x[\[aq]result\[aq]][\[aq]video\[aq]][0][\[aq]summary\[aq]],\ compat_str) +\f[] +.fi +.SS Safely extract more optional metadata +.IP +.nf +\f[C] +video\ =\ try_get(response,\ lambda\ x:\ x[\[aq]result\[aq]][\[aq]video\[aq]][0],\ dict)\ or\ {} +description\ =\ video.get(\[aq]summary\[aq]) +duration\ =\ float_or_none(video.get(\[aq]durationMs\[aq]),\ scale=1000) +view_count\ =\ int_or_none(video.get(\[aq]views\[aq])) +\f[] +.fi .SH EMBEDDING YOUTUBE\-DL .PP youtube\-dl makes the best effort to be a good command\-line program,