X-Git-Url: https://git.rapsys.eu/youtubedl/blobdiff_plain/5920ef2b4969021b7f83d154b325036d9b598877..ea20b361fd3f38ca4458e470b0152ff3d9e80e6d:/youtube-dl.1?ds=inline diff --git a/youtube-dl.1 b/youtube-dl.1 index 8ce6630..cb8f218 100644 --- a/youtube-dl.1 +++ b/youtube-dl.1 @@ -1033,7 +1033,7 @@ formatting operations (https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, \f[C]%(NAME)s\f[] or \f[C]%(NAME)05d\f[]. To clarify, that is a percent symbol followed by a name in parentheses, -followed by a formatting operations. +followed by formatting operations. Allowed names along with sequence type are: .IP \[bu] 2 \f[C]id\f[] (string): Video identifier @@ -2091,15 +2091,24 @@ Have a look at \f[C]youtube_dl/extractor/common.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a detailed description of what your extractor should and may -return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). +return (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. .IP " 8." 4 Make sure your code follows youtube\-dl coding conventions and check the -code with flake8 (https://pypi.python.org/pypi/flake8). -Also make sure your code works under all -Python (https://www.python.org/) versions claimed supported by -youtube\-dl, namely 2.6, 2.7, and 3.2+. +code with +flake8 (http://flake8.pycqa.org/en/latest/index.html#quickstart): +.RS 4 +.IP +.nf +\f[C] +\ $\ flake8\ youtube_dl/extractor/yourextractor.py +\f[] +.fi +.RE .IP " 9." 4 +Make sure your code works under all Python (https://www.python.org/) +versions claimed supported by youtube\-dl, namely 2.6, 2.7, and 3.2+. +.IP "10." 4 When the tests pass, add (https://git-scm.com/docs/git-add) the new files and commit (https://git-scm.com/docs/git-commit) them and push (https://git-scm.com/docs/git-push) the result, like this: @@ -2107,14 +2116,14 @@ push (https://git-scm.com/docs/git-push) the result, like this: .IP .nf \f[C] -\ $\ git\ add\ youtube_dl/extractor/extractors.py -\ $\ git\ add\ youtube_dl/extractor/yourextractor.py -\ $\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq] -\ $\ git\ push\ origin\ yourextractor +$\ git\ add\ youtube_dl/extractor/extractors.py +$\ git\ add\ youtube_dl/extractor/yourextractor.py +$\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq] +$\ git\ push\ origin\ yourextractor \f[] .fi .RE -.IP "10." 4 +.IP "11." 4 Finally, create a pull request (https://help.github.com/articles/creating-a-pull-request). We\[aq]ll then review and merge it. @@ -2144,7 +2153,7 @@ update at all. .PP For extraction to work youtube\-dl relies on metadata your extractor extracts and provides to youtube\-dl expressed by an information -dictionary (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) +dictionary (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303) or simply \f[I]info dict\f[]. Only the following meta fields in the \f[I]info dict\f[] are considered mandatory for a successful extraction process by youtube\-dl: @@ -2165,7 +2174,7 @@ extraction does not make any sense without and if any of them fail to be extracted then the extractor is considered completely broken. .PP Any -field (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) +field (https://github.com/rg3/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L188-L303) apart from the aforementioned ones are considered \f[B]optional\f[]. That means that extraction should be \f[B]tolerant\f[] to situations when sources for these fields can potentially be unavailable (even if @@ -2286,9 +2295,37 @@ title\ =\ meta.get(\[aq]title\[aq])\ or\ self._og_search_title(webpage) .PP This code will try to extract from \f[C]meta\f[] first and if it fails it will try extracting \f[C]og:title\f[] from a \f[C]webpage\f[]. -.SS Make regular expressions flexible +.SS Regular expressions +.SS Don\[aq]t capture groups you don\[aq]t use +.PP +Capturing group must be an indication that it\[aq]s used somewhere in +the code. +Any group that is not used must be non capturing. +.SS Example +.PP +Don\[aq]t capture id attribute name here since you can\[aq]t use it for +anything anyway. +.PP +Correct: +.IP +.nf +\f[C] +r\[aq](?:id|ID)=(?P\\d+)\[aq] +\f[] +.fi +.PP +Incorrect: +.IP +.nf +\f[C] +r\[aq](id|ID)=(?P\\d+)\[aq] +\f[] +.fi +.SS Make regular expressions relaxed and flexible .PP -When using regular expressions try to write them fuzzy and flexible. +When using regular expressions try to write them fuzzy, relaxed and +flexible, skipping insignificant parts that are more likely to change, +allowing both single and double quotes for quoted values and so on. .SS Example .PP Say you need to extract \f[C]title\f[] from the following HTML code: @@ -2331,6 +2368,32 @@ title\ =\ self._search_regex( \ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq]) \f[] .fi +.SS Long lines policy +.PP +There is a soft limit to keep lines of code under 80 characters long. +This means it should be respected if possible and if it does not make +readability and code maintenance worse. +.PP +For example, you should \f[B]never\f[] split long string literals like +URLs or some other often copied entities over multiple lines to fit this +limit: +.PP +Correct: +.IP +.nf +\f[C] +\[aq]https://www.youtube.com/watch?v=FqZTN594JQw&list=PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4\[aq] +\f[] +.fi +.PP +Incorrect: +.IP +.nf +\f[C] +\[aq]https://www.youtube.com/watch?v=FqZTN594JQw&list=\[aq] +\[aq]PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4\[aq] +\f[] +.fi .SS Use safe conversion functions .PP Wrap all extracted numeric data into safe functions from