+++ /dev/null
-## Please follow the guide below
-
-- You will be asked some questions and requested to provide some information; please read them **carefully** and answer honestly
-- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: [x])
-- Use the *Preview* tab to see how your issue will actually look
-
----
-
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.01*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with an outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.01**
-
-### Before submitting an *issue* make sure you have:
-- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
-- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
-
-### What is the purpose of your *issue*?
-- [ ] Bug report (encountered problems with youtube-dl)
-- [ ] Site support request (request for adding support for a new site)
-- [ ] Feature request (request for a new functionality)
-- [ ] Question
-- [ ] Other
-
----
-
-### The following sections concretize particular issue purposes; you can erase any section (the contents between triple ---) that is not applicable to your *issue*
-
----
-
-### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure, provide the full verbose output as follows:
-
-Add the `-v` flag to the **command line** you run youtube-dl with, copy the **whole** output, and insert it here. It should look similar to the one below (replace it with **your** log inserted between triple ```):
-```
-$ youtube-dl -v <your command line>
-[debug] System config: []
-[debug] User config: []
-[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
-[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.12.01
-[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
-[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
-[debug] Proxy map: {}
-...
-<end of log>
-```
-
----
-
-### If the purpose of this *issue* is a *site support request*, please provide all kinds of example URLs for which support should be included (replace the following example URLs with **yours**):
-- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
-- Single video: https://youtu.be/BaW_jenozKc
-- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
-
----
-
-### Description of your *issue*, suggested solution and other information
-
-Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
-If work on your *issue* requires account credentials please provide them or explain how one can obtain them.
+++ /dev/null
-## Please follow the guide below
-
-- You will be asked some questions and requested to provide some information; please read them **carefully** and answer honestly
-- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: [x])
-- Use the *Preview* tab to see how your issue will actually look
-
----
-
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with an outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
-
-### Before submitting an *issue* make sure you have:
-- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
-- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
-
-### What is the purpose of your *issue*?
-- [ ] Bug report (encountered problems with youtube-dl)
-- [ ] Site support request (request for adding support for a new site)
-- [ ] Feature request (request for a new functionality)
-- [ ] Question
-- [ ] Other
-
----
-
-### The following sections concretize particular issue purposes; you can erase any section (the contents between triple ---) that is not applicable to your *issue*
-
----
-
-### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure, provide the full verbose output as follows:
-
-Add the `-v` flag to the **command line** you run youtube-dl with, copy the **whole** output, and insert it here. It should look similar to the one below (replace it with **your** log inserted between triple ```):
-```
-$ youtube-dl -v <your command line>
-[debug] System config: []
-[debug] User config: []
-[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
-[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version %(version)s
-[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
-[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
-[debug] Proxy map: {}
-...
-<end of log>
-```
-
----
-
-### If the purpose of this *issue* is a *site support request*, please provide all kinds of example URLs for which support should be included (replace the following example URLs with **yours**):
-- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
-- Single video: https://youtu.be/BaW_jenozKc
-- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
-
----
-
-### Description of your *issue*, suggested solution and other information
-
-Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
-If work on your *issue* requires account credentials please provide them or explain how one can obtain them.
+++ /dev/null
-## Please follow the guide below
-
-- You will be asked some questions; please read them **carefully** and answer honestly
-- Put an `x` into all the boxes [ ] relevant to your *pull request* (like this: [x])
-- Use the *Preview* tab to see how your *pull request* will actually look
-
----
-
-### Before submitting a *pull request* make sure you have:
-- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/rg3/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/rg3/youtube-dl#youtube-dl-coding-conventions) sections
-- [ ] [Searched](https://github.com/rg3/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
-
-### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:
-- [ ] I am the original author of this code and I am willing to release it under [Unlicense](http://unlicense.org/)
-- [ ] I am not the original author of this code but it is in public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)
-
-### What is the purpose of your *pull request*?
-- [ ] Bug fix
-- [ ] Improvement
-- [ ] New extractor
-- [ ] New feature
-
----
-
-### Description of your *pull request* and other information
-
-Explanation of your *pull request* in arbitrary form goes here. Please make sure the description explains the purpose and effect of your *pull request* and is worded well enough to be understood. Provide as much context and examples as possible.
+++ /dev/null
-*.pyc
-*.pyo
-*.class
-*~
-*.DS_Store
-wine-py2exe/
-py2exe.log
-*.kate-swp
-build/
-dist/
-MANIFEST
-README.txt
-youtube-dl.1
-youtube-dl.bash-completion
-youtube-dl.fish
-youtube_dl/extractor/lazy_extractors.py
-youtube-dl
-youtube-dl.exe
-youtube-dl.tar.gz
-.coverage
-cover/
-updates_key.pem
-*.egg-info
-*.srt
-*.sbv
-*.vtt
-*.flv
-*.mp4
-*.m4a
-*.m4v
-*.mp3
-*.3gp
-*.wav
-*.ape
-*.mkv
-*.swf
-*.part
-*.swp
-test/testdata
-test/local_parameters.json
-.tox
-youtube-dl.zsh
-
-# IntelliJ related files
-.idea
-*.iml
-
-tmp/
+++ /dev/null
-language: python
-python:
- - "2.6"
- - "2.7"
- - "3.2"
- - "3.3"
- - "3.4"
- - "3.5"
-sudo: false
-script: nosetests test --verbose
-notifications:
- email:
- - filippo.valsorda@gmail.com
- - yasoob.khld@gmail.com
-# irc:
-# channels:
-# - "irc.freenode.org#youtube-dl"
-# skip_join: true
+++ /dev/null
-Ricardo Garcia Gonzalez
-Danny Colligan
-Benjamin Johnson
-Vasyl' Vavrychuk
-Witold Baryluk
-Paweł Paprota
-Gergely Imreh
-Rogério Brito
-Philipp Hagemeister
-Sören Schulze
-Kevin Ngo
-Ori Avtalion
-shizeeg
-Filippo Valsorda
-Christian Albrecht
-Dave Vasilevsky
-Jaime Marquínez Ferrándiz
-Jeff Crouse
-Osama Khalid
-Michael Walter
-M. Yasoob Ullah Khalid
-Julien Fraichard
-Johny Mo Swag
-Axel Noack
-Albert Kim
-Pierre Rudloff
-Huarong Huo
-Ismael Mejía
-Steffan Donal
-Andras Elso
-Jelle van der Waa
-Marcin Cieślak
-Anton Larionov
-Takuya Tsuchida
-Sergey M.
-Michael Orlitzky
-Chris Gahan
-Saimadhav Heblikar
-Mike Col
-Oleg Prutz
-pulpe
-Andreas Schmitz
-Michael Kaiser
-Niklas Laxström
-David Triendl
-Anthony Weems
-David Wagner
-Juan C. Olivares
-Mattias Harrysson
-phaer
-Sainyam Kapoor
-Nicolas Évrard
-Jason Normore
-Hoje Lee
-Adam Thalhammer
-Georg Jähnig
-Ralf Haring
-Koki Takahashi
-Ariset Llerena
-Adam Malcontenti-Wilson
-Tobias Bell
-Naglis Jonaitis
-Charles Chen
-Hassaan Ali
-Dobrosław Żybort
-David Fabijan
-Sebastian Haas
-Alexander Kirk
-Erik Johnson
-Keith Beckman
-Ole Ernst
-Aaron McDaniel (mcd1992)
-Magnus Kolstad
-Hari Padmanaban
-Carlos Ramos
-5moufl
-lenaten
-Dennis Scheiba
-Damon Timm
-winwon
-Xavier Beynon
-Gabriel Schubiner
-xantares
-Jan Matějka
-Mauroy Sébastien
-William Sewell
-Dao Hoang Son
-Oskar Jauch
-Matthew Rayfield
-t0mm0
-Tithen-Firion
-Zack Fernandes
-cryptonaut
-Adrian Kretz
-Mathias Rav
-Petr Kutalek
-Will Glynn
-Max Reimann
-Cédric Luthi
-Thijs Vermeir
-Joel Leclerc
-Christopher Krooss
-Ondřej Caletka
-Dinesh S
-Johan K. Jensen
-Yen Chi Hsuan
-Enam Mijbah Noor
-David Luhmer
-Shaya Goldberg
-Paul Hartmann
-Frans de Jonge
-Robin de Rooij
-Ryan Schmidt
-Leslie P. Polzer
-Duncan Keall
-Alexander Mamay
-Devin J. Pohly
-Eduardo Ferro Aldama
-Jeff Buchbinder
-Amish Bhadeshia
-Joram Schrijver
-Will W.
-Mohammad Teimori Pabandi
-Roman Le Négrate
-Matthias Küch
-Julian Richen
-Ping O.
-Mister Hat
-Peter Ding
-jackyzy823
-George Brighton
-Remita Amine
-Aurélio A. Heckert
-Bernhard Minks
-sceext
-Zach Bruggeman
-Tjark Saul
-slangangular
-Behrouz Abbasi
-ngld
-nyuszika7h
-Shaun Walbridge
-Lee Jenkins
-Anssi Hannula
-Lukáš Lalinský
-Qijiang Fan
-Rémy Léone
-Marco Ferragina
-reiv
-Muratcan Simsek
-Evan Lu
-flatgreen
-Brian Foley
-Vignesh Venkat
-Tom Gijselinck
-Founder Fang
-Andrew Alexeyew
-Saso Bezlaj
-Erwin de Haan
-Jens Wille
-Robin Houtevelts
-Patrick Griffis
-Aidan Rowe
-mutantmonkey
-Ben Congdon
-Kacper Michajłow
-José Joaquín Atria
-Viťas Strádal
-Kagami Hiiragi
-Philip Huppert
-blahgeek
-Kevin Deldycke
-inondle
-Tomáš Čech
-Déstin Reed
-Roman Tsiupa
-Artur Krysiak
-Jakub Adam Wieczorek
-Aleksandar Topuzović
-Nehal Patel
-Rob van Bekkum
-Petr Zvoníček
-Pratyush Singh
-Aleksander Nitecki
-Sebastian Blunt
-Matěj Cepl
-Xie Yanbo
-Philip Xu
-John Hawkinson
-Rich Leeper
-Zhong Jianxin
-Thor77
+++ /dev/null
-**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** the `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
-```
-$ youtube-dl -v <your command line>
-[debug] System config: []
-[debug] User config: []
-[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
-[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2015.12.06
-[debug] Git HEAD: 135392e
-[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
-[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
-[debug] Proxy map: {}
-...
-```
-**Do not post screenshots of verbose logs; only plain text is acceptable.**
-
-The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
-
-Please re-read your issue to avoid a couple of common mistakes (you can and should use this as a checklist):
-
-### Is the description of the issue itself sufficient?
-
-We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.
-
-So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious
-
-- What the problem is
-- How it could be fixed
-- What your proposed solution would look like
-
-If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
-
-For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
-
-If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
-
-**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `http://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `http://www.youtube.com/`) is *not* an example URL.
-
-### Are you using the latest version?
-
-Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive concern problems that are already fixed, but the reporters are using outdated versions. This goes for feature requests as well.
-
-### Is the issue already documented?
-
-Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
-
-### Why are existing options not enough?
-
-Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#options). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
-
-### Is there enough context in your bug report?
-
-People want to solve problems, and often think they do us a favor by breaking down their larger problems (e.g. wanting to skip already downloaded files) to a specific request (e.g. requesting us to look whether the file exists before downloading the info page). However, what often happens is that they break down the problem into two steps: one simple, and one impossible (or extremely complicated).
-
-We are then presented with a very complicated request when the original problem could be solved far easier, e.g. by recording the downloaded video IDs in a separate file. To avoid this, you must include the greater context where it is non-obvious. In particular, every feature request that does not consist of adding support for a new site should contain a use case scenario that explains in what situation the missing feature would be useful.
-
-### Does the issue involve one problem, and one problem only?
-
-Some of our users seem to think there is a limit to the number of issues they can or should open. There is no such limit. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
-
-In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
-
-### Is anyone going to need the feature?
-
-Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
-
-### Is your question about youtube-dl?
-
-It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different, or even the reporter's own, application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.
-
-# DEVELOPER INSTRUCTIONS
-
-Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
-
-To run youtube-dl as a developer, you don't need to build anything either. Simply execute
-
- python -m youtube_dl
-
-To run the tests, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
-
- python -m unittest discover
- python test/test_download.py
- nosetests
-
-If you want to create a build of youtube-dl yourself, you'll need
-
-* python
-* make (only GNU make is supported)
-* pandoc
-* zip
-* nosetests
-
-### Adding support for a new site
-
-If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites; thus, pull requests adding support for them **will be rejected**.
-
-After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
-
-1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
-2. Check out the source code with:
-
- git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
-
-3. Start a new git branch with
-
- cd youtube-dl
- git checkout -b yourextractor
-
-4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
-
- ```python
- # coding: utf-8
- from __future__ import unicode_literals
-
- from .common import InfoExtractor
-
-
- class YourExtractorIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
- _TEST = {
- 'url': 'http://yourextractor.com/watch/42',
- 'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
- 'info_dict': {
- 'id': '42',
- 'ext': 'mp4',
- 'title': 'Video title goes here',
- 'thumbnail': 're:^https?://.*\.jpg$',
- # TODO more properties, either as:
- # * A value
- # * MD5 checksum; start the string with md5:
- # * A regular expression; start the string with re:
- # * Any Python type (for example int or float)
- }
- }
-
- def _real_extract(self, url):
- video_id = self._match_id(url)
- webpage = self._download_webpage(url, video_id)
-
- # TODO more code goes here, for example ...
- title = self._html_search_regex(r'<h1>(.+?)</h1>', webpage, 'title')
-
- return {
- 'id': video_id,
- 'title': title,
- 'description': self._og_search_description(webpage),
- 'uploader': self._search_regex(r'<div[^>]+id="uploader"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False),
- # TODO more properties (see youtube_dl/extractor/common.py)
- }
- ```
-5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
-6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
-8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](http://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
-9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
-
- $ git add youtube_dl/extractor/extractors.py
- $ git add youtube_dl/extractor/yourextractor.py
- $ git commit -m '[yourextractor] Add new extractor'
- $ git push origin yourextractor
-
-10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
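-
-As an illustration of step 6, a hypothetical two-entry `_TESTS` list might look like the following (the URLs are placeholders for your own site; `only_matching` marks a URL that should be matched by `_VALID_URL` but not downloaded by the test runner):
-
-```python
-_TESTS = [{
-    # Full download test: the test runner downloads this URL and checks
-    # the resulting info dict against 'info_dict'
-    'url': 'http://yourextractor.com/watch/42',
-    'info_dict': {
-        'id': '42',
-        'ext': 'mp4',
-        'title': 'Video title goes here',
-    },
-}, {
-    # URL-matching test only: verifies _VALID_URL recognizes this form
-    # without downloading anything
-    'url': 'http://yourextractor.com/watch/43',
-    'only_matching': True,
-}]
-```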
-
-In any case, thank you very much for your contributions!
-
-## youtube-dl coding conventions
-
-This section introduces guidelines for writing idiomatic, robust and future-proof extractor code.
-
-Extractors are fragile by nature since they depend on the layout of the source data provided by third-party media hosters, which is out of your control and tends to change. As an extractor implementer, your task is not only to write code that extracts media links and metadata correctly, but also to minimize dependence on the source's layout and even to anticipate potential future changes. This is important because it keeps the extractor from breaking on minor layout changes, thus keeping old youtube-dl versions working. Even though such breakage is easily fixed by releasing a new version of youtube-dl with the fix incorporated, all previous versions remain broken in all repositories and distros' packages, which may not be so prompt in fetching the update from us. Needless to say, some non-rolling-release distros may never receive an update at all.
-
-### Mandatory and optional metafields
-
-For extraction to work, youtube-dl relies on the metadata your extractor extracts and provides to youtube-dl, expressed as an [information dictionary](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by youtube-dl:
-
- - `id` (media identifier)
- - `title` (media title)
- - `url` (media download URL) or `formats`
-
-In fact only the last option is technically mandatory (i.e. if you can't figure out the download location of the media, the extraction does not make any sense). But by convention youtube-dl also treats `id` and `title` as mandatory. Thus the aforementioned metafields are the critical data without which extraction does not make sense; if any of them fails to be extracted, the extractor is considered completely broken.
-
-[Any field](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257) apart from the aforementioned ones is considered **optional**. That means that extraction should be **tolerant** of situations when sources for these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** so as not to break the extraction of the general-purpose mandatory fields.
-
-#### Example
-
-Say you have some source dictionary `meta` that you've fetched as JSON with an HTTP request and it has a key `summary`:
-
-```python
-meta = self._download_json(url, video_id)
-```
-
-Assume at this point `meta`'s layout is:
-
-```python
-{
- ...
- "summary": "some fancy summary text",
- ...
-}
-```
-
-Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield, you should be prepared for this key to be missing from the `meta` dict, so you should extract it like:
-
-```python
-description = meta.get('summary') # correct
-```
-
-and not like:
-
-```python
-description = meta['summary'] # incorrect
-```
-
-The latter will break the extraction process with a `KeyError` if `summary` disappears from `meta` at some later time, while with the former approach extraction will just go ahead with `description` set to `None`, which is perfectly fine (remember that `None` is equivalent to the absence of data).
-
-Similarly, you should pass `fatal=False` when extracting optional data from a webpage with `_search_regex`, `_html_search_regex` or similar methods, for instance:
-
-```python
-description = self._search_regex(
- r'<span[^>]+id="title"[^>]*>([^<]+)<',
- webpage, 'description', fatal=False)
-```
-
-With `fatal` set to `False`, if `_search_regex` fails to extract `description` it will emit a warning and continue extraction.
-
-You can also pass `default=<some fallback value>`, for example:
-
-```python
-description = self._search_regex(
- r'<span[^>]+id="title"[^>]*>([^<]+)<',
- webpage, 'description', default=None)
-```
-
-On failure this code will silently continue the extraction with `description` set to `None`. That is useful for metafields that may or may not be present.
-
-### Provide fallbacks
-
-When extracting metadata, try to do so from multiple sources. For example, if `title` is present in several places, try extracting it from at least some of them. This makes the extractor more future-proof in case some of the sources become unavailable.
-
-#### Example
-
-Say `meta` from the previous example has a `title` and you are about to extract it. Since `title` is a mandatory meta field you should end up with something like:
-
-```python
-title = meta['title']
-```
-
-If `title` disappears from `meta` in the future due to some changes on the hoster's side, the extraction would fail since `title` is mandatory. That's expected.
-
-Assume that you have another source you can extract `title` from, for example the `og:title` HTML meta tag of the `webpage`. In this case you can provide a fallback scenario:
-
-```python
-title = meta.get('title') or self._og_search_title(webpage)
-```
-
-This code will try to extract from `meta` first, and if that fails it will try extracting `og:title` from the `webpage`.
-
-### Make regular expressions flexible
-
-When using regular expressions, write them to be fuzzy and flexible.
-
-#### Example
-
-Say you need to extract `title` from the following HTML code:
-
-```html
-<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">some fancy title</span>
-```
-
-The code for that task should look similar to:
-
-```python
-title = self._search_regex(
- r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
-```
-
-Or even better:
-
-```python
-title = self._search_regex(
- r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)',
- webpage, 'title', group='title')
-```
-
-Note how you tolerate potential changes in the `style` attribute's value or a switch from double quotes to single quotes for the `class` attribute.
-
-The code definitely should not look like:
-
-```python
-title = self._search_regex(
- r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>',
-    webpage, 'title')
-```
-
-### Use safe conversion functions
-
-Wrap all extracted numeric data in the safe conversion functions from `utils`: `int_or_none`, `float_or_none`. Use them for string-to-number conversions as well.
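-
-As a minimal sketch of the behavior you rely on (the real helpers in `youtube_dl/utils.py` accept additional parameters such as scaling and a default, so check their actual definitions before use), the key property is that missing or malformed input yields `None` rather than an exception:
-
-```python
-# Simplified re-implementation of the int_or_none idea, for illustration
-# only; extractor code should import the real helper from youtube_dl.utils.
-def int_or_none(v, default=None):
-    if v is None:
-        return default
-    try:
-        return int(v)
-    except (TypeError, ValueError):
-        return default
-
-duration = int_or_none('1042')  # a well-formed value converts normally
-missing = int_or_none(None)     # absent source data stays None
-garbled = int_or_none('n/a')    # malformed data becomes None, no crash
-```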
-
+version 2017.02.07
+
+Core
+* [extractor/common] Fix audio only with audio group in m3u8 (#11995)
++ [downloader/fragment] Respect --no-part
+* [extractor/common] Speed-up HTML5 media entries extraction (#11979)
+
+Extractors
+* [pornhub] Fix extraction (#11997)
++ [canalplus] Add support for cstar.fr (#11990)
++ [extractor/generic] Improve RTMP support (#11993)
++ [gaskrank] Add support for gaskrank.tv (#11685)
+* [bandcamp] Fix extraction for incomplete albums (#11727)
+* [iwara] Fix extraction (#11781)
+* [googledrive] Fix extraction on Python 3.6
++ [videopress] Add support for videopress.com
++ [afreecatv] Extract RTMP formats
+
+
+version 2017.02.04.1
+
+Extractors
++ [twitch:stream] Add support for player.twitch.tv (#11971)
+* [radiocanada] Fix extraction for toutv rtmp formats
+
+
+version 2017.02.04
+
+Core
++ Add --playlist-random to shuffle playlists (#11889, #11901)
+* [utils] Improve comments processing in js_to_json (#11947)
+* [utils] Handle single-line comments in js_to_json
+* [downloader/external:ffmpeg] Minimize the use of aac_adtstoasc filter
+
+Extractors
++ [piksel] Add another app token pattern (#11969)
++ [vk] Capture and output author blocked error message (#11965)
++ [turner] Fix secure HLS formats downloading with ffmpeg (#11358, #11373,
+ #11800)
++ [drtv] Add support for live and radio sections (#1827, #3427)
+* [myspace] Fix extraction and extract HLS and HTTP formats
++ [youtube] Add format info for itag 325 and 328
+* [vine] Fix extraction (#11955)
+- [sportbox] Remove extractor (#11954)
++ [filmon] Add support for filmon.com (#11187)
++ [infoq] Add audio only formats (#11565)
+* [douyutv] Improve room id regular expression (#11931)
+* [iprima] Fix extraction (#11920, #11896)
+* [youtube] Fix ytsearch when cookies are provided (#11924)
+* [go] Relax video id regular expression (#11937)
+* [facebook] Fix title extraction (#11941)
++ [youtube:playlist] Recognize TL playlists (#11945)
++ [bilibili] Support new Bangumi URLs (#11845)
++ [cbc:watch] Extract audio codec for audio only formats (#11893)
++ [elpais] Fix extraction for some URLs (#11765)
+
+
+version 2017.02.01
+
+Extractors
++ [facebook] Add another fallback extraction scenario (#11926)
+* [prosiebensat1] Fix extraction of descriptions (#11810, #11929)
+- [crunchyroll] Remove ScaledBorderAndShadow settings (#9028)
++ [vimeo] Extract upload timestamp
++ [vimeo] Extract license (#8726, #11880)
++ [nrk:series] Add support for series (#11571, #11711)
+
+
+version 2017.01.31
+
+Core
++ [compat] Add compat_etree_register_namespace
+
+Extractors
+* [youtube] Fix extraction for domainless player URLs (#11890, #11891, #11892,
+ #11894, #11895, #11897, #11900, #11903, #11904, #11906, #11907, #11909,
+ #11913, #11914, #11915, #11916, #11917, #11918, #11919)
++ [vimeo] Extract both mixed and separated DASH formats
++ [ruutu] Extract DASH formats
+* [itv] Fix extraction for python 2.6
+
+
+version 2017.01.29
+
+Core
+* [extractor/common] Fix initialization template (#11605, #11825)
++ [extractor/common] Document fragment_base_url and fragment's path fields
+* [extractor/common] Fix duration per DASH segment (#11868)
++ Introduce --autonumber-start option for initial value of %(autonumber)s
+ template (#727, #2702, #9362, #10457, #10529, #11862)
+
+Extractors
++ [azmedien:playlist] Add support for topic and themen playlists (#11817)
+* [npo] Fix subtitles extraction
++ [itv] Extract subtitles
++ [itv] Add support for itv.com (#9240)
++ [mtv81] Add support for mtv81.com (#7619)
++ [vlive] Add support for channels (#11826)
++ [kaltura] Add fallback for fileExt
++ [kaltura] Improve uploader_id extraction
++ [konserthusetplay] Add support for rspoplay.se (#11828)
+
+
+version 2017.01.28
+
+Core
+* [utils] Improve parse_duration
+
+Extractors
+* [crunchyroll] Improve series and season metadata extraction (#11832)
+* [soundcloud] Improve formats extraction and extract audio bitrate
++ [soundcloud] Extract HLS formats
+* [soundcloud] Fix track URL extraction (#11852)
++ [twitch:vod] Expand URL regular expressions (#11846)
+* [aenetworks] Fix season episodes extraction (#11669)
++ [tva] Add support for videos.tva.ca (#11842)
+* [jamendo] Improve and extract more metadata (#11836)
++ [disney] Add support for Disney sites (#7409, #11801, #4975, #11000)
+* [vevo] Remove request to old API and catch API v2 errors
++ [cmt,mtv,southpark] Add support for episode URLs (#11837)
++ [youtube] Add fallback for duration extraction (#11841)
+
+
+version 2017.01.25
+
+Extractors
++ [openload] Fallback video extension to mp4
++ [extractor/generic] Add support for Openload embeds (#11536, #11812)
+* [srgssr] Fix rts video extraction (#11831)
++ [afreecatv:global] Add support for afreeca.tv (#11807)
++ [crackle] Extract vtt subtitles
++ [crackle] Extract multiple resolutions for thumbnails
++ [crackle] Add support for mobile URLs
++ [konserthusetplay] Extract subtitles (#11823)
++ [konserthusetplay] Add support for HLS videos (#11823)
+* [vimeo:review] Fix config URL extraction (#11821)
+
+
+version 2017.01.24
+
+Extractors
+* [pluralsight] Fix extraction (#11820)
++ [nextmedia] Add support for NextTV (壹電視)
+* [24video] Fix extraction (#11811)
+* [youtube:playlist] Fix nonexistent and private playlist detection (#11604)
++ [chirbit] Extract uploader (#11809)
+
+
+version 2017.01.22
+
+Extractors
++ [pornflip] Add support for pornflip.com (#11556, #11795)
+* [chaturbate] Fix extraction (#11797, #11802)
++ [azmedien] Add support for AZ Medien sites (#11784, #11785)
++ [nextmedia] Support redirected URLs
++ [vimeo:channel] Extract videos' titles for playlist entries (#11796)
++ [youtube] Extract episode metadata (#9695, #11774)
++ [cspan] Support Ustream embedded videos (#11547)
++ [1tv] Add support for HLS videos (#11786)
+* [uol] Fix extraction (#11770)
+* [mtv] Relax triforce feed regular expression (#11766)
+
+
+version 2017.01.18
+
+Extractors
+* [bilibili] Fix extraction (#11077)
++ [canalplus] Add fallback for video id (#11764)
+* [20min] Fix extraction (#11683, #11751)
+* [imdb] Extend URL regular expression (#11744)
++ [naver] Add support for tv.naver.com links (#11743)
+
+
+version 2017.01.16
+
+Core
+* [options] Apply custom config to final composite configuration (#11741)
+* [YoutubeDL] Improve protocol auto determining (#11720)
+
+Extractors
+* [xiami] Relax URL regular expressions
+* [xiami] Improve track metadata extraction (#11699)
++ [limelight] Check hand-made direct HTTP links
++ [limelight] Add support for direct HTTP links at video.llnw.net (#11737)
++ [brightcove] Recognize another player ID pattern (#11688)
++ [niconico] Support login via cookies (#7968)
+* [yourupload] Fix extraction (#11601)
++ [beam:live] Add support for beam.pro live streams (#10702, #11596)
+* [vevo] Improve geo restriction detection
++ [dramafever] Add support for URLs with language code (#11714)
+* [cbc] Improve playlist support (#11704)
+
+
+version 2017.01.14
+
+Core
++ [common] Add ability to customize akamai manifest host
++ [utils] Add more date formats
+
+Extractors
+- [mtv] Eliminate _transform_rtmp_url
+* [mtv] Generalize triforce mgid extraction
++ [cmt] Add support for full episodes and video clips (#11623)
++ [mitele] Extract DASH formats
++ [ooyala] Add support for videos with embedToken (#11684)
+* [mixcloud] Fix extraction (#11674)
+* [openload] Fix extraction (#10408)
+* [tv4] Improve extraction (#11698)
+* [freesound] Fix and improve extraction (#11602)
++ [nick] Add support for beta.nick.com (#11655)
+* [mtv,cc] Use HLS by default with native HLS downloader (#11641)
+* [mtv] Fix non-HLS extraction
+
+
+version 2017.01.10
+
+Extractors
+* [youtube] Fix extraction (#11663, #11664)
++ [inc] Add support for inc.com (#11277, #11647)
++ [youtube] Add itag 212 (#11575)
++ [egghead:course] Add support for egghead.io courses
+
+
+version 2017.01.08
+
+Core
+* Fix "invalid escape sequence" errors under Python 3.6 (#11581)
+
+Extractors
++ [hitrecord] Add support for hitrecord.org (#10867, #11626)
+- [videott] Remove extractor
+* [swrmediathek] Improve extraction
+- [sharesix] Remove extractor
+- [aol:features] Remove extractor
+* [sendtonews] Improve info extraction
+* [3sat,phoenix] Fix extraction (#11619)
+* [comedycentral/mtv] Add support for HLS videos (#11600)
+* [discoverygo] Fix JSON data parsing (#11219, #11522)
+
+
+version 2017.01.05
+
+Extractors
++ [zdf] Fix extraction (#11055, #11063)
+* [pornhub:playlist] Improve extraction (#11594)
++ [cctv] Add support for ncpa-classic.com (#11591)
++ [tunein] Add support for embeds (#11579)
+
+
+version 2017.01.02
+
+Extractors
+* [cctv] Improve extraction (#879, #6753, #8541)
++ [nrktv:episodes] Add support for episodes (#11571)
++ [arkena] Add support for video.arkena.com (#11568)
+
+
+version 2016.12.31
+
+Core
++ Introduce --config-location option for custom configuration files (#6745,
+ #10648)
+
+Extractors
++ [twitch] Add support for player.twitch.tv (#11535, #11537)
++ [videa] Add support for videa.hu (#8181, #11133)
+* [vk] Fix postlive videos extraction
+* [vk] Extract from playerParams (#11555)
+- [freevideo] Remove extractor (#11515)
++ [showroomlive] Add support for showroom-live.com (#11458)
+* [xhamster] Fix duration extraction (#11549)
+* [rtve:live] Fix extraction (#11529)
+* [brightcove:legacy] Improve embeds detection (#11523)
++ [twitch] Add support for rechat messages (#11524)
+* [acast] Fix audio and timestamp extraction (#11521)
+
+
+version 2016.12.22
+
+Core
+* [extractor/common] Improve detection of video-only formats in m3u8
+ manifests (#11507)
+
+Extractors
++ [theplatform] Pass geo verification headers to SMIL request (#10146)
++ [viu] Pass geo verification headers to auth request
+* [rtl2] Extract more formats and metadata
+* [vbox7] Skip malformed JSON-LD (#11501)
+* [uplynk] Force downloading using native HLS downloader (#11496)
++ [laola1] Add support for another extraction scenario (#11460)
+
+
+version 2016.12.20
+
+Core
+* [extractor/common] Improve fragment URL construction for DASH media
+* [extractor/common] Fix codec information extraction for mixed audio/video
+ DASH media (#11490)
+
+Extractors
+* [vbox7] Fix extraction (#11494)
++ [uktvplay] Add support for uktvplay.uktv.co.uk (#11027)
++ [piksel] Add support for player.piksel.com (#11246)
++ [vimeo] Add support for DASH formats
+* [vimeo] Fix extraction for HLS formats (#11490)
+* [kaltura] Fix wrong widget ID in some cases (#11480)
++ [nrktv:direkte] Add support for live streams (#11488)
+* [pbs] Fix extraction for geo restricted videos (#7095)
+* [brightcove:new] Skip widevine classic videos
++ [viu] Add support for viu.com (#10607, #11329)
+
+
+version 2016.12.18
+
+Core
++ [extractor/common] Recognize DASH formats in html5 media entries
+
+Extractors
++ [ccma] Add support for ccma.cat (#11359)
+* [laola1tv] Improve extraction
++ [laola1tv] Add support for embed URLs (#11460)
+* [nbc] Fix extraction for MSNBC videos (#11466)
+* [twitch] Adapt to new videos pages URL schema (#11469)
++ [meipai] Add support for meipai.com (#10718)
+* [jwplatform] Improve subtitles and duration extraction
++ [ondemandkorea] Add support for ondemandkorea.com (#10772)
++ [vvvvid] Add support for vvvvid.it (#5915)
+
+
+version 2016.12.15
+
+Core
++ [utils] Add convenience urljoin
+
+Extractors
++ [openload] Recognize oload.tv URLs (#10408)
++ [facebook] Recognize .onion URLs (#11443)
+* [vlive] Fix extraction (#11375, #11383)
++ [canvas] Extract DASH formats
++ [melonvod] Add support for vod.melon.com (#11419)
+
+
+version 2016.12.12
+
+Core
++ [utils] Add common user agents map
++ [common] Recognize HLS manifests that contain video-only formats (#11394)
+
+Extractors
++ [dplay] Use Safari user agent for HLS (#11418)
++ [facebook] Detect login required error message
+* [facebook] Improve video selection (#11390)
++ [canalplus] Add another video id pattern (#11399)
+* [mixcloud] Relax URL regular expression (#11406)
+* [ctvnews] Relax URL regular expression (#11394)
++ [rte] Capture and output error message (#7746, #10498)
++ [prosiebensat1] Add support for DASH formats
+* [srgssr] Improve extraction for geo restricted videos (#11089)
+* [rts] Improve extraction for geo restricted videos (#4989)
+
+
+version 2016.12.09
+
+Core
+* [socks] Fix error reporting (#11355)
+
+Extractors
+* [openload] Fix extraction (#10408)
+* [pandoratv] Fix extraction (#11023)
++ [telebruxelles] Add support for emission URLs
+* [telebruxelles] Extract all formats
++ [bloomberg] Add another video id regular expression (#11371)
+* [fusion] Update ooyala id regular expression (#11364)
++ [1tv] Add support for playlists (#11335)
+* [1tv] Improve extraction (#11335)
++ [aenetworks] Extract more formats (#11321)
++ [thisoldhouse] Recognize /tv-episode/ URLs (#11271)
+
+
version 2016.12.01
Extractors
You can also use pip:
- sudo pip install --upgrade youtube-dl
+ sudo -H pip install --upgrade youtube-dl
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
-**youtube-dl** is a command-line program to download videos from
-YouTube.com and a few more sites. It requires the Python interpreter, version
-2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
-your Unix box, on Windows or on Mac OS X. It is released to the public domain,
-which means you can modify it, redistribute it or use it however you like.
+**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...]
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
+ --config-location PATH Location of the configuration file; either
+ the path to the config or its containing
+ directory.
--flat-playlist Do not extract the videos of a playlist,
only list them.
--mark-watched Mark videos watched (YouTube only)
--no-mark-watched Do not mark videos watched (YouTube only)
--no-color Do not emit color codes in output
- --abort-on-unavailable-fragment Abort downloading when some fragment is not
- available
## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
string (--proxy "") for direct connection
--socket-timeout SECONDS Time to wait before giving up, in seconds
--source-address IP Client-side IP address to bind to
- (experimental)
-4, --force-ipv4 Make all connections via IPv4
- (experimental)
-6, --force-ipv6 Make all connections via IPv6
- (experimental)
--geo-verification-proxy URL Use this proxy to verify the IP address for
some geo-restricted sites. The default
proxy specified by --proxy (or none, if the
option is not present) is used for the
- actual downloading. (experimental)
+ actual downloading.
## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1)
COUNT views
--max-views COUNT Do not download any videos with more than
COUNT views
- --match-filter FILTER Generic video filter (experimental).
- Specify any key (see help for -o for a list
- of available keys) to match if the key is
- present, !key to check if the key is not
- present,key > NUMBER (like "comment_count >
- 12", also works with >=, <, <=, !=, =) to
- compare against a number, and & to require
- multiple matches. Values which are not
- known are excluded unless you put a
- question mark (?) after the operator.For
- example, to only match videos that have
- been liked more than 100 times and disliked
- less than 50 times (or the dislike
- functionality is not available at the given
- service), but who also have a description,
- use --match-filter "like_count > 100 &
- dislike_count <? 50 & description" .
+ --match-filter FILTER Generic video filter. Specify any key (see
+ help for -o for a list of available keys)
+ to match if the key is present, !key to
+ check if the key is not present, key >
+ NUMBER (like "comment_count > 12", also
+ works with >=, <, <=, !=, =) to compare
+ against a number, and & to require multiple
+ matches. Values which are not known are
+ excluded unless you put a question mark (?)
+ after the operator. For example, to only
+ match videos that have been liked more than
+ 100 times and disliked less than 50 times
+ (or the dislike functionality is not
+ available at the given service), but which
+ also have a description, use --match-filter
+ "like_count > 100 & dislike_count <? 50 &
+ description".
--no-playlist Download only the video, if the URL refers
to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to
only)
--skip-unavailable-fragments Skip unavailable fragments (DASH and
hlsnative only)
+ --abort-on-unavailable-fragment Abort downloading when some fragment is not
+ available
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
(default is 1024)
--no-resize-buffer Do not automatically adjust the buffer
automatically resized from an initial value
of SIZE.
--playlist-reverse Download playlist videos in reverse order
+ --playlist-random Download playlist videos in random order
--xattr-set-filesize Set file xattribute ytdl.filesize with
- expected filesize (experimental)
+ expected file size (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
--autonumber-size NUMBER Specify the number of digits in
%(autonumber)s when it is present in output
filename template or --auto-number option
- is given
+ is given (default is 5)
+ --autonumber-start NUMBER Specify the start value for %(autonumber)s
+ (default is 1)
--restrict-filenames Restrict filenames to only ASCII
characters, and avoid "&" and spaces in
filenames
-u, --username USERNAME Login with this account ID
-p, --password PASSWORD Account password. If this option is left
out, youtube-dl will ask interactively.
- -2, --twofactor TWOFACTOR Two-factor auth code
+ -2, --twofactor TWOFACTOR Two-factor authentication code
-n, --netrc Use .netrc authentication data
--video-password PASSWORD Video password (vimeo, smotri, youku)
avprobe)
--audio-format FORMAT Specify audio format: "best", "aac",
"vorbis", "mp3", "m4a", "opus", or "wav";
- "best" by default
+ "best" by default; no effect without -x
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
a value between 0 (better) and 9 (worse)
for VBR or a specific bitrate like 128K
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
+You can also use `--config-location` if you want to use a custom configuration file for a particular youtube-dl run.
+
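For example, a configuration file could look like this (a sketch; the options shown are standard youtube-dl flags, and the output path is a placeholder):

```
# Lines starting with # are comments

# Always extract audio
-x

# Save all videos under a Movies directory in your home directory
-o ~/Movies/%(title)s.%(ext)s
```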
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`), so that you do not have to pass credentials as command-line arguments on every youtube-dl run and do not leave plain-text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on a per-extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by only you:
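A sketch of the setup (the `chmod` mode matches the read/write-by-owner-only restriction described above):

```shell
# Create the file and make it readable/writable by the current user only
touch "$HOME/.netrc"
chmod a-rwx,u+rw "$HOME/.netrc"
```

Each extractor's credentials then go on a line of the standard `.netrc` form `machine <extractor> login <login> password <password>`, where the login and password shown are placeholders for your own.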
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- - `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
+ - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by a particular extractor, i.e. the metadata offered by the video hoster.
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
-# Download best video only format but no bigger that 50 MB
+# Download best video-only format but no bigger than 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
### I get HTTP error 402 when trying to download a video. What's this?
-Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
+Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering providing a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the YouTube URL, solving the CAPTCHA, and restarting youtube-dl.
### Do I need any other programs?
Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](http://www.videolan.org/) or [mplayer](http://www.mplayerhq.hu/).
-### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
+### I extracted a video URL with `-g`, but it does not play on another machine / in my web browser.
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies and/or HTTP headers. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl. You can also get necessary cookies and HTTP headers from JSON output obtained with `--dump-json`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
-Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
+Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to work around login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
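As a sketch of the workflow (the header line is the mandatory Netscape-format first line mentioned above; the URL is a placeholder, and the actual cookie entries would come from your browser export):

```shell
# A valid cookies file must start with a recognized header line
printf '# Netscape HTTP Cookie File\n' > cookies.txt

# Then point youtube-dl at it (placeholder URL; requires network access):
# youtube-dl --cookies cookies.txt 'https://www.youtube.com/watch?v=BaW_jenozKc'

head -n 1 cookies.txt
```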
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
-After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
+After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
}
```
-Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
+Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field, you should be prepared for this key to be missing from the `meta` dict, so you should extract it like:
```python
description = meta.get('summary') # correct
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
-Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
+Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
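A sketch of such a program, using the documented `YoutubeDL` options (`logger`, `progress_hooks`, `postprocessors`); the final download call is shown commented out because it requires network access and a working ffmpeg/avconv:

```python
from __future__ import unicode_literals


class MyLogger(object):
    """Swallow youtube-dl's debug and warning messages; print only errors."""
    def debug(self, msg):
        pass

    def warning(self, msg):
        pass

    def error(self, msg):
        print(msg)


def my_hook(d):
    # Progress hooks receive a status dict on every download update.
    if d['status'] == 'finished':
        print('Done downloading, now converting ...')


ydl_opts = {
    'format': 'bestaudio/best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
    'logger': MyLogger(),
    'progress_hooks': [my_hook],
}

# import youtube_dl
# with youtube_dl.YoutubeDL(ydl_opts) as ydl:
#     ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```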
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
-In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
+In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White House podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?
--- /dev/null
+youtube-dl - download videos from youtube.com or other video platforms
+
+- INSTALLATION
+- DESCRIPTION
+- OPTIONS
+- CONFIGURATION
+- OUTPUT TEMPLATE
+- FORMAT SELECTION
+- VIDEO SELECTION
+- FAQ
+- DEVELOPER INSTRUCTIONS
+- EMBEDDING YOUTUBE-DL
+- BUGS
+- COPYRIGHT
+
+
+
+INSTALLATION
+
+
+To install it right away for all UNIX users (Linux, OS X, etc.), type:
+
+ sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
+ sudo chmod a+rx /usr/local/bin/youtube-dl
+
+If you do not have curl, you can alternatively use a recent wget:
+
+ sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
+ sudo chmod a+rx /usr/local/bin/youtube-dl
+
+Windows users can download an .exe file and place it in any location on
+their PATH except for %SYSTEMROOT%\System32 (e.g. DO NOT put in
+C:\Windows\System32).
+
+You can also use pip:
+
+ sudo -H pip install --upgrade youtube-dl
+
+This command will update youtube-dl if you have already installed it.
+See the pypi page for more information.
+
+OS X users can install youtube-dl with Homebrew:
+
+ brew install youtube-dl
+
+Or with MacPorts:
+
+ sudo port install youtube-dl
+
+Alternatively, refer to the developer instructions for how to check out
+and work with the git repository. For further options, including PGP
+signatures, see the youtube-dl Download Page.
+
+
+
+DESCRIPTION
+
+
+YOUTUBE-DL is a command-line program to download videos from YouTube.com
+and a few more sites. It requires the Python interpreter, version 2.6,
+2.7, or 3.2+, and it is not platform specific. It should work on your
+Unix box, on Windows or on Mac OS X. It is released to the public
+domain, which means you can modify it, redistribute it or use it however
+you like.
+
+ youtube-dl [OPTIONS] URL [URL...]
+
+
+
+OPTIONS
+
+
+ -h, --help Print this help text and exit
+ --version Print program version and exit
+ -U, --update Update this program to the latest version. Make
+ sure that you have sufficient permissions
+ (run with sudo if needed)
+ -i, --ignore-errors Continue on download errors, for example to
+ skip unavailable videos in a playlist
+ --abort-on-error Abort downloading of further videos (in the
+ playlist or the command line) if an error
+ occurs
+ --dump-user-agent Display the current browser identification
+ --list-extractors List all supported extractors
+ --extractor-descriptions Output descriptions of all supported
+ extractors
+ --force-generic-extractor Force extraction to use the generic
+ extractor
+ --default-search PREFIX Use this prefix for unqualified URLs. For
+ example "gvsearch2:" downloads two videos
+ from google videos for youtube-dl "large
+ apple". Use the value "auto" to let
+ youtube-dl guess ("auto_warning" to emit a
+ warning when guessing). "error" just throws
+ an error. The default value "fixup_error"
+ repairs broken URLs, but emits an error if
+ this is not possible instead of searching.
+ --ignore-config Do not read configuration files. When given
+ in the global configuration file
+ /etc/youtube-dl.conf: Do not read the user
+ configuration in ~/.config/youtube-
+ dl/config (%APPDATA%/youtube-dl/config.txt
+ on Windows)
+ --config-location PATH Location of the configuration file; either
+ the path to the config or its containing
+ directory.
+ --flat-playlist Do not extract the videos of a playlist,
+ only list them.
+ --mark-watched Mark videos watched (YouTube only)
+ --no-mark-watched Do not mark videos watched (YouTube only)
+ --no-color Do not emit color codes in output
+
+
+Network Options:
+
+ --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
+ To enable experimental SOCKS proxy, specify
+ a proper scheme. For example
+ socks5://127.0.0.1:1080/. Pass in an empty
+ string (--proxy "") for direct connection
+ --socket-timeout SECONDS Time to wait before giving up, in seconds
+ --source-address IP Client-side IP address to bind to
+ -4, --force-ipv4 Make all connections via IPv4
+ -6, --force-ipv6 Make all connections via IPv6
+ --geo-verification-proxy URL Use this proxy to verify the IP address for
+ some geo-restricted sites. The default
+ proxy specified by --proxy (or none, if the
+ option is not present) is used for the
+ actual downloading.
+
+
+Video Selection:
+
+ --playlist-start NUMBER Playlist video to start at (default is 1)
+ --playlist-end NUMBER Playlist video to end at (default is last)
+ --playlist-items ITEM_SPEC Playlist video items to download. Specify
+ indices of the videos in the playlist
+ separated by commas like: "--playlist-items
+ 1,2,5,8" if you want to download videos
+ indexed 1, 2, 5, 8 in the playlist. You can
+ specify range: "--playlist-items
+ 1-3,7,10-13", it will download the videos
+ at index 1, 2, 3, 7, 10, 11, 12 and 13.
+ --match-title REGEX Download only matching titles (regex or
+ caseless sub-string)
+ --reject-title REGEX Skip download for matching titles (regex or
+ caseless sub-string)
+ --max-downloads NUMBER Abort after downloading NUMBER files
+ --min-filesize SIZE Do not download any videos smaller than
+ SIZE (e.g. 50k or 44.6m)
+ --max-filesize SIZE Do not download any videos larger than SIZE
+ (e.g. 50k or 44.6m)
+ --date DATE Download only videos uploaded in this date
+ --datebefore DATE Download only videos uploaded on or before
+ this date (i.e. inclusive)
+ --dateafter DATE Download only videos uploaded on or after
+ this date (i.e. inclusive)
+ --min-views COUNT Do not download any videos with less than
+ COUNT views
+ --max-views COUNT Do not download any videos with more than
+ COUNT views
+ --match-filter FILTER Generic video filter. Specify any key (see
+ help for -o for a list of available keys)
+ to match if the key is present, !key to
+ check if the key is not present, key >
+ NUMBER (like "comment_count > 12", also
+ works with >=, <, <=, !=, =) to compare
+ against a number, and & to require multiple
+ matches. Values which are not known are
+ excluded unless you put a question mark (?)
+ after the operator. For example, to only
+ match videos that have been liked more than
+ 100 times and disliked less than 50 times
+ (or the dislike functionality is not
+ available at the given service), but which
+ also have a description, use --match-filter
+ "like_count > 100 & dislike_count <? 50 &
+ description".
+ --no-playlist Download only the video, if the URL refers
+ to a video and a playlist.
+ --yes-playlist Download the playlist, if the URL refers to
+ a video and a playlist.
+ --age-limit YEARS Download only videos suitable for the given
+ age
+ --download-archive FILE Download only videos not listed in the
+ archive file. Record the IDs of all
+ downloaded videos in it.
+ --include-ads Download advertisements as well
+ (experimental)
+
+
+Download Options:
+
+ -r, --limit-rate RATE Maximum download rate in bytes per second
+ (e.g. 50K or 4.2M)
+ -R, --retries RETRIES Number of retries (default is 10), or
+ "infinite".
+ --fragment-retries RETRIES Number of retries for a fragment (default
+ is 10), or "infinite" (DASH and hlsnative
+ only)
+ --skip-unavailable-fragments Skip unavailable fragments (DASH and
+ hlsnative only)
+ --abort-on-unavailable-fragment Abort downloading when some fragment is not
+ available
+ --buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
+ (default is 1024)
+ --no-resize-buffer Do not automatically adjust the buffer
+ size. By default, the buffer size is
+ automatically resized from an initial value
+ of SIZE.
+ --playlist-reverse Download playlist videos in reverse order
+ --playlist-random Download playlist videos in random order
+ --xattr-set-filesize Set file xattribute ytdl.filesize with
+ expected file size (experimental)
+ --hls-prefer-native Use the native HLS downloader instead of
+ ffmpeg
+ --hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
+ downloader
+ --hls-use-mpegts Use the mpegts container for HLS videos,
+ allowing you to play the video while it is
+ downloading (some players may not be able
+ to play it)
+ --external-downloader COMMAND Use the specified external downloader.
+ Currently supports
+ aria2c,avconv,axel,curl,ffmpeg,httpie,wget
+ --external-downloader-args ARGS Give these arguments to the external
+ downloader
+
+
+Filesystem Options:
+
+ -a, --batch-file FILE File containing URLs to download ('-' for
+ stdin)
+ --id Use only video ID in file name
+ -o, --output TEMPLATE Output filename template, see the "OUTPUT
+ TEMPLATE" for all the info
+ --autonumber-size NUMBER Specify the number of digits in
+ %(autonumber)s when it is present in output
+ filename template or --auto-number option
+ is given (default is 5)
+ --autonumber-start NUMBER Specify the start value for %(autonumber)s
+ (default is 1)
+ --restrict-filenames Restrict filenames to only ASCII
+ characters, and avoid "&" and spaces in
+ filenames
+ -A, --auto-number [deprecated; use -o
+ "%(autonumber)s-%(title)s.%(ext)s" ] Number
+ downloaded files starting from 00000
+ -t, --title [deprecated] Use title in file name
+ (default)
+ -l, --literal [deprecated] Alias of --title
+ -w, --no-overwrites Do not overwrite files
+ -c, --continue Force resume of partially downloaded files.
+ By default, youtube-dl will resume
+ downloads if possible.
+ --no-continue Do not resume partially downloaded files
+ (restart from beginning)
+ --no-part Do not use .part files - write directly
+ into output file
+ --no-mtime Do not use the Last-modified header to set
+ the file modification time
+ --write-description Write video description to a .description
+ file
+ --write-info-json Write video metadata to a .info.json file
+ --write-annotations Write video annotations to a
+ .annotations.xml file
+ --load-info-json FILE JSON file containing the video information
+ (created with the "--write-info-json"
+ option)
+ --cookies FILE File to read cookies from and dump cookie
+ jar in
+ --cache-dir DIR Location in the filesystem where youtube-dl
+ can store some downloaded information
+ permanently. By default
+ $XDG_CACHE_HOME/youtube-dl or
+ ~/.cache/youtube-dl. At the moment, only
+ YouTube player files (for videos with
+ obfuscated signatures) are cached, but that
+ may change.
+ --no-cache-dir Disable filesystem caching
+ --rm-cache-dir Delete all filesystem cache files
+
+
+Thumbnail images:
+
+ --write-thumbnail Write thumbnail image to disk
+ --write-all-thumbnails Write all thumbnail image formats to disk
+ --list-thumbnails Simulate and list all available thumbnail
+ formats
+
+
+Verbosity / Simulation Options:
+
+ -q, --quiet Activate quiet mode
+ --no-warnings Ignore warnings
+ -s, --simulate Do not download the video and do not write
+ anything to disk
+ --skip-download Do not download the video
+ -g, --get-url Simulate, quiet but print URL
+ -e, --get-title Simulate, quiet but print title
+ --get-id Simulate, quiet but print id
+ --get-thumbnail Simulate, quiet but print thumbnail URL
+ --get-description Simulate, quiet but print video description
+ --get-duration Simulate, quiet but print video length
+ --get-filename Simulate, quiet but print output filename
+ --get-format Simulate, quiet but print output format
+ -j, --dump-json Simulate, quiet but print JSON information.
+ See --output for a description of available
+ keys.
+ -J, --dump-single-json Simulate, quiet but print JSON information
+ for each command-line argument. If the URL
+ refers to a playlist, dump the whole
+ playlist information in a single line.
+ --print-json Be quiet and print the video information as
+ JSON (video is still being downloaded).
+ --newline Output progress bar as new lines
+ --no-progress Do not print progress bar
+ --console-title Display progress in console titlebar
+ -v, --verbose Print various debugging information
+ --dump-pages Print downloaded pages encoded using base64
+ to debug problems (very verbose)
+ --write-pages Write downloaded intermediary pages to
+ files in the current directory to debug
+ problems
+ --print-traffic Display sent and read HTTP traffic
+ -C, --call-home Contact the youtube-dl server for debugging
+ --no-call-home Do NOT contact the youtube-dl server for
+ debugging
+
+
+Workarounds:
+
+ --encoding ENCODING Force the specified encoding (experimental)
+ --no-check-certificate Suppress HTTPS certificate validation
+ --prefer-insecure Use an unencrypted connection to retrieve
+ information about the video. (Currently
+ supported only for YouTube)
+ --user-agent UA Specify a custom user agent
+ --referer URL Specify a custom referer, use if the video
+ access is restricted to one domain
+ --add-header FIELD:VALUE Specify a custom HTTP header and its value,
+ separated by a colon ':'. You can use this
+ option multiple times
+ --bidi-workaround Work around terminals that lack
+ bidirectional text support. Requires bidiv
+ or fribidi executable in PATH
+ --sleep-interval SECONDS Number of seconds to sleep before each
+ download when used alone or a lower bound
+ of a range for randomized sleep before each
+ download (minimum possible number of
+ seconds to sleep) when used along with
+ --max-sleep-interval.
+ --max-sleep-interval SECONDS Upper bound of a range for randomized sleep
+ before each download (maximum possible
+ number of seconds to sleep). Must only be
+ used along with --min-sleep-interval.
+
+
+Video Format Options:
+
+ -f, --format FORMAT Video format code, see the "FORMAT
+ SELECTION" for all the info
+ --all-formats Download all available video formats
+ --prefer-free-formats Prefer free video formats unless a specific
+ one is requested
+ -F, --list-formats List all available formats of requested
+ videos
+ --youtube-skip-dash-manifest Do not download the DASH manifests and
+ related data on YouTube videos
+ --merge-output-format FORMAT If a merge is required (e.g.
+ bestvideo+bestaudio), output to given
+ container format. One of mkv, mp4, ogg,
+ webm, flv. Ignored if no merge is required
+
+
+Subtitle Options:
+
+ --write-sub Write subtitle file
+ --write-auto-sub Write automatically generated subtitle file
+ (YouTube only)
+ --all-subs Download all the available subtitles of the
+ video
+ --list-subs List all available subtitles for the video
+ --sub-format FORMAT Subtitle format, accepts formats
+ preference, for example: "srt" or
+ "ass/srt/best"
+ --sub-lang LANGS Languages of the subtitles to download
+ (optional) separated by commas, use --list-
+ subs for available language tags
+
+
+Authentication Options:
+
+ -u, --username USERNAME Login with this account ID
+ -p, --password PASSWORD Account password. If this option is left
+ out, youtube-dl will ask interactively.
+ -2, --twofactor TWOFACTOR Two-factor authentication code
+ -n, --netrc Use .netrc authentication data
+ --video-password PASSWORD Video password (vimeo, smotri, youku)
+
+
+Adobe Pass Options:
+
+ --ap-mso MSO Adobe Pass multiple-system operator (TV
+ provider) identifier, use --ap-list-mso for
+ a list of available MSOs
+ --ap-username USERNAME Multiple-system operator account login
+ --ap-password PASSWORD Multiple-system operator account password.
+ If this option is left out, youtube-dl will
+ ask interactively.
+ --ap-list-mso List all supported multiple-system
+ operators
+
+
+Post-processing Options:
+
+ -x, --extract-audio Convert video files to audio-only files
+ (requires ffmpeg or avconv and ffprobe or
+ avprobe)
+ --audio-format FORMAT Specify audio format: "best", "aac",
+ "vorbis", "mp3", "m4a", "opus", or "wav";
+ "best" by default; No effect without -x
+ --audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
+ a value between 0 (better) and 9 (worse)
+ for VBR or a specific bitrate like 128K
+ (default 5)
+ --recode-video FORMAT Encode the video to another format if
+ necessary (currently supported:
+ mp4|flv|ogg|webm|mkv|avi)
+ --postprocessor-args ARGS Give these arguments to the postprocessor
+ -k, --keep-video Keep the video file on disk after the post-
+ processing; the video is erased by default
+ --no-post-overwrites Do not overwrite post-processed files; the
+ post-processed files are overwritten by
+ default
+ --embed-subs Embed subtitles in the video (only for mp4,
+ webm and mkv videos)
+ --embed-thumbnail Embed thumbnail in the audio as cover art
+ --add-metadata Write metadata to the video file
+ --metadata-from-title FORMAT Parse additional metadata like song title /
+ artist from the video title. The format
+ syntax is the same as --output, the parsed
+ parameters replace existing values.
+ Additional templates: %(album)s,
+ %(artist)s. Example: --metadata-from-title
+ "%(artist)s - %(title)s" matches a title
+ like "Coldplay - Paradise"
+ --xattrs Write metadata to the video file's xattrs
+ (using dublin core and xdg standards)
+ --fixup POLICY Automatically correct known faults of the
+ file. One of never (do nothing), warn (only
+ emit a warning), detect_or_warn (the
+ default; fix file if we can, warn
+ otherwise)
+ --prefer-avconv Prefer avconv over ffmpeg for running the
+ postprocessors (default)
+ --prefer-ffmpeg Prefer ffmpeg over avconv for running the
+ postprocessors
+ --ffmpeg-location PATH Location of the ffmpeg/avconv binary;
+ either the path to the binary or its
+ containing directory.
+ --exec CMD Execute a command on the file after
+ downloading, similar to find's -exec
+ syntax. Example: --exec 'adb push {}
+ /sdcard/Music/ && rm {}'
+ --convert-subs FORMAT Convert the subtitles to other format
+ (currently supported: srt|ass|vtt)
+
+
+
+CONFIGURATION
+
+
+You can configure youtube-dl by placing any supported command line
+option to a configuration file. On Linux and OS X, the system wide
+configuration file is located at /etc/youtube-dl.conf and the user wide
+configuration file at ~/.config/youtube-dl/config. On Windows, the user
+wide configuration file locations are %APPDATA%\youtube-dl\config.txt or
+C:\Users\<user name>\youtube-dl.conf. Note that the configuration file
+may not exist by default, so you may need to create it yourself.
+
+For example, with the following configuration file youtube-dl will
+always extract the audio, not copy the mtime, use a proxy and save all
+videos under Movies directory in your home directory:
+
+ # Lines starting with # are comments
+
+ # Always extract audio
+ -x
+
+ # Do not copy the mtime
+ --no-mtime
+
+ # Use this proxy
+ --proxy 127.0.0.1:3128
+
+ # Save all videos under Movies directory in your home directory
+ -o ~/Movies/%(title)s.%(ext)s
+
+Note that options in the configuration file are the same switches used
+in regular command line calls, so there MUST BE NO WHITESPACE after -
+or --, e.g. -o or --proxy but not - o or -- proxy.
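+To illustrate this rule, here is a minimal Python sketch of how such a
+file can be split into command line tokens (an assumption for
+illustration, not youtube-dl's exact parser):

```python
import shlex

def read_config(text):
    """Split configuration file contents into argv-style tokens.

    Lines starting with '#' are treated as comments. Because tokens
    are split on whitespace, "-o" must be written without a space
    after the dash, exactly as on the command line.
    """
    return shlex.split(text, comments=True)

config = """
# Always extract audio
-x
# Save all videos under Movies directory in your home directory
-o ~/Movies/%(title)s.%(ext)s
"""
print(read_config(config))  # ['-x', '-o', '~/Movies/%(title)s.%(ext)s']
```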
+
+You can use --ignore-config if you want to disable the configuration
+file for a particular youtube-dl run.
+
+You can also use --config-location if you want to use custom
+configuration file for a particular youtube-dl run.
+
+Authentication with .netrc file
+
+You may also want to configure automatic credentials storage for
+extractors that support authentication (by providing login and password
+with --username and --password) in order not to pass credentials as
+command line arguments on every youtube-dl execution and prevent
+tracking plain text passwords in the shell command history. You can
+achieve this using a .netrc file on a per extractor basis. For that you
+will need to create a .netrc file in your $HOME and restrict permissions
+to read/write by only you:
+
+ touch $HOME/.netrc
+ chmod a-rwx,u+rw $HOME/.netrc
+
+After that you can add credentials for an extractor in the following
+format, where _extractor_ is the name of the extractor in lowercase:
+
+ machine <extractor> login <login> password <password>
+
+For example:
+
+ machine youtube login myaccount@gmail.com password my_youtube_password
+ machine twitch login my_twitch_account_name password my_twitch_password
+
+To activate authentication with the .netrc file you should pass --netrc
+to youtube-dl or place it in the configuration file.
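+The file's format can be sanity-checked with Python's standard netrc
+module; a small sketch, using a temporary file in place of $HOME/.netrc:

```python
import netrc
import os
import tempfile

# Write a sample .netrc entry to a temporary file; the machine name
# is the extractor name in lowercase, as described above.
content = ("machine youtube login myaccount@gmail.com "
           "password my_youtube_password\n")

fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write(content)

# authenticators() returns a (login, account, password) tuple.
login, account, password = netrc.netrc(path).authenticators('youtube')
print(login, password)  # myaccount@gmail.com my_youtube_password

os.remove(path)
```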
+
+On Windows you may also need to set up the %HOME% environment variable
+manually.
+
+
+
+OUTPUT TEMPLATE
+
+
+The -o option allows users to indicate a template for the output file
+names.
+
+TL;DR: jump to the examples below.
+
+The basic usage is not to set any template arguments when downloading a
+single file, like in youtube-dl -o funny_video.flv "http://some/video".
+However, it may contain special sequences that will be replaced when
+downloading each video. The special sequences have the format %(NAME)s.
+To clarify, that is a percent symbol followed by a name in parentheses,
+followed by a lowercase S. Allowed names are:
+
+- id: Video identifier
+- title: Video title
+- url: Video URL
+- ext: Video filename extension
+- alt_title: A secondary title of the video
+- display_id: An alternative identifier for the video
+- uploader: Full name of the video uploader
+- license: License name the video is licensed under
+- creator: The creator of the video
+- release_date: The date (YYYYMMDD) when the video was released
+- timestamp: UNIX timestamp of the moment the video became available
+- upload_date: Video upload date (YYYYMMDD)
+- uploader_id: Nickname or id of the video uploader
+- location: Physical location where the video was filmed
+- duration: Length of the video in seconds
+- view_count: How many users have watched the video on the platform
+- like_count: Number of positive ratings of the video
+- dislike_count: Number of negative ratings of the video
+- repost_count: Number of reposts of the video
+- average_rating: Average rating given by users; the scale used depends
+ on the webpage
+- comment_count: Number of comments on the video
+- age_limit: Age restriction for the video (years)
+- format: A human-readable description of the format
+- format_id: Format code specified by --format
+- format_note: Additional info about the format
+- width: Width of the video
+- height: Height of the video
+- resolution: Textual description of width and height
+- tbr: Average bitrate of audio and video in KBit/s
+- abr: Average audio bitrate in KBit/s
+- acodec: Name of the audio codec in use
+- asr: Audio sampling rate in Hertz
+- vbr: Average video bitrate in KBit/s
+- fps: Frame rate
+- vcodec: Name of the video codec in use
+- container: Name of the container format
+- filesize: The number of bytes, if known in advance
+- filesize_approx: An estimate for the number of bytes
+- protocol: The protocol that will be used for the actual download
+- extractor: Name of the extractor
+- extractor_key: Key name of the extractor
+- epoch: Unix epoch when creating the file
+- autonumber: Five-digit number that will be increased with each
+ download, starting at zero
+- playlist: Name or id of the playlist that contains the video
+- playlist_index: Index of the video in the playlist padded with
+ leading zeros according to the total length of the playlist
+- playlist_id: Playlist identifier
+- playlist_title: Playlist title
+
+Available for the video that belongs to some logical chapter or section:
+
+- chapter: Name or title of the chapter the video belongs to
+- chapter_number: Number of the chapter the video belongs to
+- chapter_id: Id of the chapter the video belongs to
+
+Available for the video that is an episode of some series or programme:
+
+- series: Title of the series or programme the video episode belongs to
+- season: Title of the season the video episode belongs to
+- season_number: Number of the season the video episode belongs to
+- season_id: Id of the season the video episode belongs to
+- episode: Title of the video episode
+- episode_number: Number of the video episode within a season
+- episode_id: Id of the video episode
+
+Available for the media that is a track or a part of a music album:
+
+- track: Title of the track
+- track_number: Number of the track within an album or a disc
+- track_id: Id of the track
+- artist: Artist(s) of the track
+- genre: Genre(s) of the track
+- album: Title of the album the track belongs to
+- album_type: Type of the album
+- album_artist: List of all artists that appeared on the album
+- disc_number: Number of the disc or other physical medium the track
+ belongs to
+- release_year: Year (YYYY) when the album was released
+
+Each aforementioned sequence when referenced in an output template will
+be replaced by the actual value corresponding to the sequence name. Note
+that some of the sequences are not guaranteed to be present since they
+depend on the metadata obtained by a particular extractor. Such
+sequences will be replaced with NA.
+
+For example for -o %(title)s-%(id)s.%(ext)s and an mp4 video with title
+youtube-dl test video and id BaW_jenozKcj, this will result in a
+youtube-dl test video-BaW_jenozKcj.mp4 file created in the current
+directory.
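+Under the hood, the template is expanded with ordinary Python
+percent-style string formatting against the video's metadata
+dictionary; the following sketch (a simplification of the real
+implementation) reproduces the example above:

```python
from collections import defaultdict

# Metadata as an extractor might report it (simplified).
info = {'title': 'youtube-dl test video', 'id': 'BaW_jenozKcj', 'ext': 'mp4'}

# Fields missing from the metadata fall back to the string 'NA'.
template_dict = defaultdict(lambda: 'NA', info)

print('%(title)s-%(id)s.%(ext)s' % template_dict)
# youtube-dl test video-BaW_jenozKcj.mp4

print('%(uploader)s.%(ext)s' % template_dict)
# NA.mp4
```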
+
+Output templates can also contain an arbitrary hierarchical path, e.g.
+-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' which will
+result in downloading each video in a directory corresponding to this
+path template. Any missing directory will be automatically created for
+you.
+
+To use percent literals in an output template use %%. To output to
+stdout use -o -.
+
+The current default template is %(title)s-%(id)s.%(ext)s.
+
+In some cases, you don't want special characters such as 中, spaces, or
+&, such as when transferring the downloaded filename to a Windows system
+or passing the filename through an 8bit-unsafe channel. In these cases,
+add the --restrict-filenames flag to get a shorter title.
+
+Output template and Windows batch files
+
+If you are using an output template inside a Windows batch file then you
+must escape plain percent characters (%) by doubling, so that
+-o "%(title)s-%(id)s.%(ext)s" should become
+-o "%%(title)s-%%(id)s.%%(ext)s". However you should not touch %'s that
+are not plain characters, e.g. environment variables for expansion
+should stay intact: -o "C:\%HOMEPATH%\Desktop\%%(title)s.%%(ext)s".
+
+Output template examples
+
+Note on Windows you may need to use double quotes instead of single.
+
+ $ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
+ youtube-dl test video ''_ä↭𝕐.mp4 # All kinds of weird characters
+
+ $ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc --restrict-filenames
+ youtube-dl_test_video_.mp4 # A simple file name
+
+ # Download YouTube playlist videos in separate directory indexed by video order in a playlist
+ $ youtube-dl -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
+
+ # Download all playlists of YouTube channel/user keeping each playlist in separate directory:
+ $ youtube-dl -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
+
+ # Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
+ $ youtube-dl -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
+
+ # Download entire series season keeping each series and each season in separate directory under C:/MyVideos
+ $ youtube-dl -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" http://videomore.ru/kino_v_detalayah/5_sezon/367617
+
+ # Stream the video being downloaded to stdout
+ $ youtube-dl -o - BaW_jenozKc
+
+
+
+FORMAT SELECTION
+
+
+By default youtube-dl tries to download the best available quality, i.e.
+if you want the best quality you DON'T NEED to pass any special options,
+youtube-dl will guess it for you by DEFAULT.
+
+But sometimes you may want to download in a different format, for
+example when you are on a slow or intermittent connection. The key
+mechanism for achieving this is so-called _format selection_, with
+which you can explicitly specify the desired format, select formats
+based on some criterion or criteria, set up precedence and much more.
+
+The general syntax for format selection is --format FORMAT or shorter
+-f FORMAT where FORMAT is a _selector expression_, i.e. an expression
+that describes format or formats you would like to download.
+
+TL;DR: jump to the examples below.
+
+The simplest case is requesting a specific format, for example with
+-f 22 you can download the format with format code equal to 22. You can
+get the list of available format codes for a particular video using
+--list-formats or -F. Note that these format codes are extractor
+specific.
+
+You can also use a file extension (currently 3gp, aac, flv, m4a, mp3,
+mp4, ogg, wav, webm are supported) to download the best quality format
+of a particular file extension served as a single file, e.g. -f webm
+will download the best quality format with the webm extension served as
+a single file.
+
+You can also use special names to select particular edge case formats:
+
+- best: Select the best quality format represented by a single file
+ with video and audio.
+- worst: Select the worst quality format represented by a single file
+ with video and audio.
+- bestvideo: Select the best quality video-only format (e.g. DASH
+ video). May not be available.
+- worstvideo: Select the worst quality video-only format. May not be
+ available.
+- bestaudio: Select the best quality audio-only format. May not be
+ available.
+- worstaudio: Select the worst quality audio-only format. May not be
+ available.
+
+For example, to download the worst quality video-only format you can use
+-f worstvideo.
+
+If you want to download multiple videos and they don't have the same
+formats available, you can specify the order of preference using
+slashes. Note that slash is left-associative, i.e. formats on the left
+hand side are preferred, for example -f 22/17/18 will download format 22
+if it's available, otherwise it will download format 17 if it's
+available, otherwise it will download format 18 if it's available,
+otherwise it will complain that no suitable formats are available for
+download.
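+The left-associative fallback can be sketched in a few lines of Python
+(an illustration, not the actual selector implementation):

```python
def pick_format(selector, available):
    """Return the first format code from a '/'-separated preference
    list that is actually available, or None if none of them are."""
    for code in selector.split('/'):
        if code in available:
            return code
    return None

print(pick_format('22/17/18', {'17', '18'}))  # 17
print(pick_format('22/17/18', {'36'}))        # None
```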
+
+If you want to download several formats of the same video use a comma
+as a separator, e.g. -f 22,17,18 will download all three of these
+formats if they are available. Or a more sophisticated example combined
+with the precedence feature: -f 136/137/mp4/bestvideo,140/m4a/bestaudio.
+
+You can also filter the video formats by putting a condition in
+brackets, as in -f "best[height=720]" (or -f "[filesize>10M]").
+
+The following numeric meta fields can be used with comparisons <, <=,
+>, >=, = (equals), != (not equals):
+
+- filesize: The number of bytes, if known in advance
+- width: Width of the video, if known
+- height: Height of the video, if known
+- tbr: Average bitrate of audio and video in KBit/s
+- abr: Average audio bitrate in KBit/s
+- vbr: Average video bitrate in KBit/s
+- asr: Audio sampling rate in Hertz
+- fps: Frame rate
+
+Filtering also works with the string comparisons = (equals), != (not
+equals), ^= (begins with), $= (ends with), *= (contains) and the
+following string meta fields:
+
+- ext: File extension
+- acodec: Name of the audio codec in use
+- vcodec: Name of the video codec in use
+- container: Name of the container format
+- protocol: The protocol that will be used for the actual download,
+ lower-case (http, https, rtsp, rtmp, rtmpe, mms, f4m, ism, m3u8, or
+ m3u8_native)
+- format_id: A short description of the format
+
+Note that none of the aforementioned meta fields are guaranteed to be
+present since this solely depends on the metadata obtained by a
+particular extractor, i.e. the metadata offered by the video hoster.
+
+Formats for which the value is not known are excluded unless you put a
+question mark (?) after the operator. You can combine format filters, so
+-f "[height <=? 720][tbr>500]" selects up to 720p videos (or videos
+where the height is not known) with a bitrate of at least 500 KBit/s.
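+The semantics of the question mark can be sketched in Python (a
+hypothetical helper for illustration, not the real filter parser):

```python
import operator
import re

OPS = {'<': operator.lt, '<=': operator.le, '>': operator.gt,
       '>=': operator.ge, '=': operator.eq, '!=': operator.ne}

def matches(fmt, filter_spec):
    """Evaluate one numeric filter like 'height <=? 720' against a
    format dict. A missing field fails the filter unless '?' follows
    the operator, in which case the format is kept."""
    m = re.match(r'\s*(\w+)\s*(<=|>=|!=|<|>|=)(\?)?\s*(\d+)\s*$',
                 filter_spec)
    key, op, optional, value = (m.group(1), m.group(2), m.group(3),
                                int(m.group(4)))
    actual = fmt.get(key)
    if actual is None:
        return bool(optional)
    return OPS[op](actual, value)

fmt = {'tbr': 600}  # height unknown for this format
print(matches(fmt, 'height <=? 720'))  # True: unknown, but optional
print(matches(fmt, 'height <= 720'))   # False: unknown, excluded
print(matches(fmt, 'tbr > 500'))       # True
```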
+
+You can merge the video and audio of two formats into a single file
+using -f <video-format>+<audio-format> (requires ffmpeg or avconv
+installed), for example -f bestvideo+bestaudio will download the best
+video-only format, the best audio-only format and mux them together with
+ffmpeg/avconv.
+
+Format selectors can also be grouped using parentheses, for example if
+you want to download the best mp4 and webm formats with a height lower
+than 480 you can use -f '(mp4,webm)[height<480]'.
+
+Since the end of April 2015 and version 2015.04.26, youtube-dl uses
+-f bestvideo+bestaudio/best as the default format selection (see #5447,
+#5456). If ffmpeg or avconv are installed this results in downloading
+bestvideo and bestaudio separately and muxing them together into a
+single file giving the best overall quality available. Otherwise it
+falls back to best and results in downloading the best available quality
+served as a single file. best is also needed for videos that don't come
+from YouTube because they don't provide the audio and video in two
+different files. If you want to only download some DASH formats (for
+example if you are not interested in getting videos with a resolution
+higher than 1080p), you can add
+-f bestvideo[height<=?1080]+bestaudio/best to your configuration file.
+Note that if you use youtube-dl to stream to stdout (and most likely to
+pipe it to your media player then), i.e. you explicitly specify output
+template as -o -, youtube-dl still uses -f best format selection in
+order to start content delivery immediately to your player and not to
+wait until bestvideo and bestaudio are downloaded and muxed.
+
+If you want to preserve the old format selection behavior (prior to
+youtube-dl 2015.04.26), i.e. you want to download the best available
+quality media served as a single file, you should explicitly specify
+your choice with -f best. You may want to add it to the configuration
+file in order not to type it every time you run youtube-dl.
+
+Format selection examples
+
+Note on Windows you may need to use double quotes instead of single.
+
+ # Download best mp4 format available, or any other best format if no mp4 is available
+ $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
+
+ # Download best format available but not better than 480p
+ $ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
+
+ # Download best format available but no bigger than 50 MB
+ $ youtube-dl -f 'best[filesize<50M]'
+
+ # Download best format available via direct link over HTTP/HTTPS protocol
+ $ youtube-dl -f '(bestvideo+bestaudio/best)[protocol^=http]'
+
+ # Download the best video format and the best audio format without merging them
+ $ youtube-dl -f 'bestvideo,bestaudio' -o '%(title)s.f%(format_id)s.%(ext)s'
+
+Note that in the last example, an output template is recommended as
+bestvideo and bestaudio may have the same file name.
+
+
+
+VIDEO SELECTION
+
+
+Videos can be filtered by their upload date using the options --date,
+--datebefore or --dateafter. They accept dates in two formats:
+
+- Absolute dates: Dates in the format YYYYMMDD.
+- Relative dates: Dates in the format
+ (now|today)[+-][0-9](day|week|month|year)(s)?
+
+Examples:
+
+ # Download only the videos uploaded in the last 6 months
+ $ youtube-dl --dateafter now-6months
+
+ # Download only the videos uploaded on January 1, 1970
+ $ youtube-dl --date 19700101
+
+ # Download only the videos uploaded in the 200x decade
+ $ youtube-dl --dateafter 20000101 --datebefore 20091231
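+Relative dates of this form can be resolved with a short Python sketch;
+like youtube-dl's own date handling, it approximates a month as 30 days
+and a year as 365:

```python
import datetime
import re

# Days per unit; month and year are deliberate approximations.
UNITS = {'day': 1, 'week': 7, 'month': 30, 'year': 365}

def date_from_str(s, today=None):
    """Resolve an absolute (YYYYMMDD) or relative date string."""
    today = today or datetime.date.today()
    if s in ('now', 'today'):
        return today
    m = re.match(r'(now|today)([+-])(\d+)(day|week|month|year)s?$', s)
    if m:
        sign = 1 if m.group(2) == '+' else -1
        days = sign * int(m.group(3)) * UNITS[m.group(4)]
        return today + datetime.timedelta(days=days)
    return datetime.datetime.strptime(s, '%Y%m%d').date()

print(date_from_str('now-6months', today=datetime.date(2016, 12, 1)))
print(date_from_str('19700101'))
```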
+
+
+
+FAQ
+
+
+How do I update youtube-dl?
+
+If you've followed our manual installation instructions, you can simply
+run youtube-dl -U (or, on Linux, sudo youtube-dl -U).
+
+If you have used pip, a simple sudo pip install -U youtube-dl is
+sufficient to update.
+
+If you have installed youtube-dl using a package manager like _apt-get_
+or _yum_, use the standard system update mechanism to update. Note that
+distribution packages are often outdated. As a rule of thumb, youtube-dl
+releases at least once a month, and often weekly or even daily. Simply
+go to http://yt-dl.org/ to find out the current version. Unfortunately,
+there is nothing we youtube-dl developers can do if your distribution
+serves a really outdated version. You can (and should) complain to your
+distribution in their bugtracker or support forum.
+
+As a last resort, you can also uninstall the version installed by your
+package manager and follow our manual installation instructions. For
+that, remove the distribution's package, with a line like
+
+ sudo apt-get remove -y youtube-dl
+
+Afterwards, simply follow our manual installation instructions:
+
+ sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl
+ sudo chmod a+x /usr/local/bin/youtube-dl
+ hash -r
+
+Again, from then on you'll be able to update with sudo youtube-dl -U.
+
+youtube-dl is extremely slow to start on Windows
+
+Add a file exclusion for youtube-dl.exe in Windows Defender settings.
+
+I'm getting an error Unable to extract OpenGraph title on YouTube playlists
+
+YouTube changed its playlist format in March 2014 and again later, so
+you'll need at least youtube-dl 2014.07.25 to download all YouTube
+videos.
+
+If you have installed youtube-dl with a package manager, pip, setup.py
+or a tarball, please use that to update. Note that Ubuntu packages do
+not seem to get updated anymore. Since we are not affiliated with
+Ubuntu, there is little we can do. Feel free to report bugs to the
+Ubuntu packaging people - all they have to do is update the package to a
+somewhat recent version. See above for a way to update.
+
+I'm getting an error when trying to use output template: error: using output template conflicts with using title, video ID or auto number
+
+Make sure you are not combining -o with any of the options -t, --title,
+--id, -A or --auto-number, whether on the command line or in a
+configuration file. Remove the latter options if present.
+
+Do I always have to pass -citw?
+
+By default, youtube-dl intends to have the best options (incidentally,
+if you have a convincing case that these should be different, please
+file an issue where you explain that). Therefore, it is unnecessary and
+sometimes harmful to copy long option strings from webpages. In
+particular, the only option out of -citw that is regularly useful is -i.
+
+Can you please put the -b option back?
+
+Most people asking this question are not aware that youtube-dl now
+defaults to downloading the highest available quality as reported by
+YouTube, which will be 1080p or 720p in some cases, so you no longer
+need the -b option. For some specific videos, maybe YouTube does not
+report them to be available in a specific high quality format you're
+interested in. In that case, simply request it with the -f option and
+youtube-dl will try to download it.
+
+I get HTTP error 402 when trying to download a video. What's this?
+
+Apparently YouTube requires you to pass a CAPTCHA test if you download
+too much. We're considering providing a way to let you solve the
+CAPTCHA, but at the moment, your best course of action is to point a web
+browser at the YouTube URL, solve the CAPTCHA, and restart youtube-dl.
+
+Do I need any other programs?
+
+youtube-dl works fine on its own on most sites. However, if you want to
+convert video/audio, you'll need avconv or ffmpeg. On some sites - most
+notably YouTube - videos can be retrieved in a higher quality format
+without sound. youtube-dl will detect whether avconv/ffmpeg is present
+and automatically pick the best option.
+
+Videos or video formats streamed via RTMP protocol can only be
+downloaded when rtmpdump is installed. Downloading MMS and RTSP videos
+requires either mplayer or mpv to be installed.
+
+I have downloaded a video but how can I play it?
+
+Once the video is fully downloaded, use any video player, such as mpv,
+vlc or mplayer.
+
+I extracted a video URL with -g, but it does not play on another machine / in my web browser.
+
+It depends a lot on the service. In many cases, requests for the video
+(to download/play it) must come from the same IP address and with the
+same cookies and/or HTTP headers. Use the --cookies option to write the
+required cookies into a file, and advise your downloader to read cookies
+from that file. Some sites also require a common user agent to be used;
+use --dump-user-agent to see the one in use by youtube-dl. You can also
+get necessary cookies and HTTP headers from JSON output obtained with
+--dump-json.
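For instance, the http_headers object in the --dump-json output can be parsed with any JSON library and replayed in your own downloader. A minimal sketch, using a hypothetical, abridged line of output:

```python
import json

# Hypothetical, abridged line of `youtube-dl --dump-json` output
line = ('{"url": "https://example.com/video.mp4", '
        '"http_headers": {"User-Agent": "Mozilla/5.0", '
        '"Cookie": "sid=abc123"}}')

info = json.loads(line)
headers = info.get('http_headers', {})  # headers to replay when downloading
cookie = headers.get('Cookie')          # 'sid=abc123'
```

Each extracted header can then be passed to your downloader, e.g. as a -H argument to curl.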
+
+It may be beneficial to use IPv6; in some cases, the restrictions are
+only applied to IPv4. Some services (sometimes only for a subset of
+videos) do not restrict the video URL by IP address, cookie, or
+user-agent, but these are the exception rather than the rule.
+
+Please bear in mind that some URL protocols are NOT supported by
+browsers out of the box, including RTMP. If you are using -g, your own
+downloader must support these as well.
+
+If you want to play the video on a machine that is not running
+youtube-dl, you can relay the video content from the machine that runs
+youtube-dl. You can use -o - to let youtube-dl stream a video to stdout,
+or simply allow the player to download the files written by youtube-dl
+in turn.
+
+ERROR: no fmt_url_map or conn information found in video info
+
+YouTube has switched to a new video info format in July 2011 which is
+not supported by old versions of youtube-dl. See above for how to update
+youtube-dl.
+
+ERROR: unable to download video
+
+YouTube requires an additional signature since September 2012 which is
+not supported by old versions of youtube-dl. See above for how to update
+youtube-dl.
+
+Video URL contains an ampersand and I'm getting some strange output [1] 2839 or 'v' is not recognized as an internal or external command
+
+That's actually the output from your shell. Since the ampersand is one
+of the special shell characters, it's interpreted by the shell,
+preventing you from passing the whole URL to youtube-dl. To keep your
+shell from interpreting the ampersands (or any other special
+characters), either put the whole URL in quotes or escape them with a
+backslash (which approach works depends on your shell).
+
+For example, if your URL is
+https://www.youtube.com/watch?t=4&v=BaW_jenozKc you should end up with
+the following command:
+
+youtube-dl 'https://www.youtube.com/watch?t=4&v=BaW_jenozKc'
+
+or
+
+youtube-dl https://www.youtube.com/watch?t=4\&v=BaW_jenozKc
+
+For Windows you have to use the double quotes:
+
+youtube-dl "https://www.youtube.com/watch?t=4&v=BaW_jenozKc"
+
+ExtractorError: Could not find JS function u'OF'
+
+In February 2015, the new YouTube player contained a character sequence
+in a string that was misinterpreted by old versions of youtube-dl. See
+above for how to update youtube-dl.
+
+HTTP Error 429: Too Many Requests or 402: Payment Required
+
+These two error codes indicate that the service is blocking your IP
+address because of overuse. Contact the service and ask them to unblock
+your IP address, or - if you have acquired a whitelisted IP address
+already - use the --proxy or --source-address options to select another
+IP address.
+
+SyntaxError: Non-ASCII character
+
+The error
+
+ File "youtube-dl", line 2
+ SyntaxError: Non-ASCII character '\x93' ...
+
+means you're using an outdated version of Python. Please update to
+Python 2.6 or 2.7.
+
+What is this binary file? Where has the code gone?
+
+Since June 2012 (#342) youtube-dl is packed as an executable zipfile,
+simply unzip it (might need renaming to youtube-dl.zip first on some
+systems) or clone the git repository, as laid out above. If you modify
+the code, you can run it by executing the __main__.py file. To recompile
+the executable, run make youtube-dl.
+
+The exe throws an error due to missing MSVCR100.dll
+
+To run the exe you first need to install the Microsoft Visual C++ 2010
+Redistributable Package (x86).
+
+On Windows, how should I set up ffmpeg and youtube-dl? Where should I put the exe files?
+
+If you put youtube-dl and ffmpeg in the same directory that you're
+running the command from, it will work, but that's rather cumbersome.
+
+To make a different directory work - either for ffmpeg, or for
+youtube-dl, or for both - simply create the directory (say, C:\bin, or
+C:\Users\<User name>\bin), put all the executables directly in there,
+and then set your PATH environment variable to include that directory.
+
+From then on, after restarting your shell, you will be able to access
+both youtube-dl and ffmpeg (and youtube-dl will be able to find ffmpeg)
+by simply typing youtube-dl or ffmpeg, no matter what directory you're
+in.
+
+How do I put downloads into a specific folder?
+
+Use the -o option to specify an output template, for example
+-o "/home/user/videos/%(title)s-%(id)s.%(ext)s". If you want this for
+all of your downloads, put the option into your configuration file.
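Output templates are expanded with Python's %-style string formatting, so you can preview what a template will produce. A small sketch with made-up metadata:

```python
# Made-up metadata of a downloaded video
info = {'title': 'Some Video', 'id': 'abc123', 'ext': 'mp4'}

template = '/home/user/videos/%(title)s-%(id)s.%(ext)s'
path = template % info  # '/home/user/videos/Some Video-abc123.mp4'
```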
+
+How do I download a video starting with a -?
+
+Either prepend http://www.youtube.com/watch?v= or separate the ID from
+the options with --:
+
+ youtube-dl -- -wNyEUrxzFU
+ youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"
+
+How do I pass cookies to youtube-dl?
+
+Use the --cookies option, for example
+--cookies /path/to/cookies/file.txt.
+
+To extract cookies from your browser, use any conforming browser
+extension for exporting cookies, for example cookies.txt (for Chrome)
+or Export Cookies (for Firefox).
+
+Note that the cookies file must be in Mozilla/Netscape format and the
+first line of the cookies file must be either # HTTP Cookie File or
+# Netscape HTTP Cookie File. Make sure you have correct newline format
+in the cookies file and convert newlines if necessary to correspond with
+your OS, namely CRLF (\r\n) for Windows and LF (\n) for Unix and
+Unix-like systems (Linux, Mac OS, etc.). HTTP Error 400: Bad Request
+when using --cookies is a good sign of invalid newline format.
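If your cookies file has the wrong line endings, a short Python sketch like the following converts it to Unix (LF) format; working on bytes (files opened with 'rb'/'wb') avoids any implicit newline translation:

```python
def to_unix_newlines(data):
    """Normalize Windows (CRLF) line endings to Unix (LF)."""
    return data.replace(b'\r\n', b'\n')

# A cookies file fragment saved with Windows line endings
raw = (b'# Netscape HTTP Cookie File\r\n'
       b'.example.com\tTRUE\t/\tFALSE\t0\tsid\tabc123\r\n')
fixed = to_unix_newlines(raw)
```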
+
+Passing cookies to youtube-dl is a good way to work around login when a
+particular extractor does not implement it explicitly. Another use case
+is working around the CAPTCHA some websites require you to solve in
+particular cases in order to get access (e.g. YouTube, CloudFlare).
+
+How do I stream directly to media player?
+
+You will first need to tell youtube-dl to stream media to stdout with
+-o -, also tell your media player to read from stdin (it must be
+capable of this for streaming), and then pipe the former to the latter.
+For example, streaming to vlc can be achieved with:
+
+ youtube-dl -o - "http://www.youtube.com/watch?v=BaW_jenozKcj" | vlc -
+
+How do I download only new videos from a playlist?
+
+Use the --download-archive feature. With this feature you should
+initially download the complete playlist with
+--download-archive /path/to/download/archive/file.txt that will record
+identifiers of all the videos in a special file. Each subsequent run
+with the same --download-archive will download only new videos and skip
+all videos that have been downloaded before. Note that only successful
+downloads are recorded in the file.
+
+For example, at first,
+
+ youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
+
+will download the complete PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re playlist
+and create a file archive.txt. Each subsequent run will only download
+new videos if any:
+
+ youtube-dl --download-archive archive.txt "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
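The archive file itself is plain text: each line records the extractor name and the video ID of a completed download (e.g. youtube BaW_jenozKc). A minimal sketch of reading the recorded IDs back:

```python
def archive_ids(text):
    """Collect video IDs from --download-archive file contents.

    Each non-empty line has the form 'extractor video_id'.
    """
    ids = set()
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            ids.add(parts[1])
    return ids

sample = 'youtube BaW_jenozKc\nyoutube -wNyEUrxzFU\n'
recorded = archive_ids(sample)  # {'BaW_jenozKc', '-wNyEUrxzFU'}
```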
+
+Should I add --hls-prefer-native into my config?
+
+When youtube-dl detects an HLS video, it can download it either with the
+built-in downloader or ffmpeg. Since many HLS streams are slightly
+invalid and ffmpeg/youtube-dl each handle some invalid cases better than
+the other, there is an option to switch the downloader if needed.
+
+When youtube-dl knows that one particular downloader works better for a
+given website, that downloader will be picked. Otherwise, youtube-dl
+will pick the best downloader for general compatibility, which at the
+moment happens to be ffmpeg. This choice may change in future versions
+of youtube-dl, with improvements of the built-in downloader and/or
+ffmpeg.
+
+In particular, the generic extractor (used when your website is not in
+the list of sites supported by youtube-dl) cannot mandate one specific
+downloader.
+
+If you put either --hls-prefer-native or --hls-prefer-ffmpeg into your
+configuration, a different subset of videos will fail to download
+correctly. Instead, it is much better to file an issue or a pull request
+which details why the native or the ffmpeg HLS downloader is a better
+choice for your use case.
+
+Can you add support for this anime video site, or site which shows current movies for free?
+
+As a matter of policy (as well as legality), youtube-dl does not include
+support for services that specialize in infringing copyright. As a rule
+of thumb, if you cannot easily find a video that the service is quite
+obviously allowed to distribute (i.e. that has been uploaded by the
+creator, the creator's distributor, or is published under a free
+license), the service is probably unfit for inclusion in youtube-dl.
+
+A note on the service stating that they don't host the infringing
+content but just link to those who do is evidence that the service
+should NOT be included in youtube-dl. The same goes for any DMCA note
+when the whole
+front page of the service is filled with videos they are not allowed to
+distribute. A "fair use" note is equally unconvincing if the service
+shows copyright-protected videos in full without authorization.
+
+Support requests for services that DO purchase the rights to distribute
+their content are perfectly fine though. If in doubt, you can simply
+include a source that mentions the legitimate purchase of content.
+
+How can I speed up work on my issue?
+
+(Also known as: Help, my important issue is not being solved!) The
+youtube-dl core developer team is quite small. While we do our best to
+solve as many issues as possible, sometimes that can take quite a while.
+To speed up your issue, here's what you can do:
+
+First of all, please do report the issue at our issue tracker. That
+allows us to coordinate all efforts by users and developers, and serves
+as a unified point. Unfortunately, the youtube-dl project has grown too
+large to use personal email as an effective communication channel.
+
+Please read the bug reporting instructions below. A lot of bugs lack all
+the necessary information. If you can, offer proxy, VPN, or shell access
+to the youtube-dl developers. If you are able to, test the issue from
+multiple computers in multiple countries to exclude local censorship or
+misconfiguration issues.
+
+If nobody is interested in solving your issue, you are welcome to take
+matters into your own hands and submit a pull request (or coerce/pay
+somebody else to do so).
+
+Feel free to bump the issue from time to time by writing a small comment
+("Issue is still present in youtube-dl version ...from France, but fixed
+from Belgium"), but please not more than once a month. Please do not
+declare your issue as important or urgent.
+
+How can I detect whether a given URL is supported by youtube-dl?
+
+For one, have a look at the list of supported sites. Note that it can
+sometimes happen that the site changes its URL scheme (say, from
+http://example.com/video/1234567 to http://example.com/v/1234567 ) and
+youtube-dl reports a URL of a service in that list as unsupported. In
+that case, simply report a bug.
+
+It is _not_ possible to detect whether a URL is supported or not. That's
+because youtube-dl contains a generic extractor which matches ALL URLs.
+You may be tempted to disable, exclude, or remove the generic extractor,
+but the generic extractor not only allows users to extract videos from
+lots of websites that embed a video from another service, but may also
+be used to extract video from a service that is hosting it itself.
+Therefore, we neither recommend nor support disabling, excluding, or
+removing the generic extractor.
+
+If you want to find out whether a given URL is supported, simply call
+youtube-dl with it. If you get no videos back, chances are the URL is
+either not referring to a video or unsupported. You can find out which
+by examining the output (if you run youtube-dl on the console) or by
+catching an UnsupportedError exception if you run it from a Python
+program.
+
+
+
+WHY DO I NEED TO GO THROUGH THAT MUCH RED TAPE WHEN FILING BUGS?
+
+
+Before we had the issue template, despite our extensive bug reporting
+instructions, about 80% of the issue reports we got were useless, for
+instance because people used ancient versions hundreds of releases old,
+because of simple syntactic errors (not in youtube-dl but in general
+shell usage), because the problem was already reported multiple times
+before, because people did not actually read an error message, even if
+it said "please install ffmpeg", because people did not mention the URL
+they were trying to download and many more simple, easy-to-avoid
+problems, many of which were totally unrelated to youtube-dl.
+
+youtube-dl is an open-source project manned by too few volunteers, so
+we'd rather spend time fixing bugs where we are certain none of those
+simple problems apply, and where we can be reasonably confident to be
+able to reproduce the issue without asking the reporter repeatedly. As
+such, the output of youtube-dl -v YOUR_URL_HERE is really all that's
+required to file an issue. The issue template also guides you through
+some basic steps you can do, such as checking that your version of
+youtube-dl is current.
+
+
+
+DEVELOPER INSTRUCTIONS
+
+
+Most users do not need to build youtube-dl and can download the builds
+or get them from their distribution.
+
+To run youtube-dl as a developer, you don't need to build anything
+either. Simply execute
+
+ python -m youtube_dl
+
+To run the tests, simply invoke your favorite test runner, or execute a
+test file directly; any of the following work:
+
+ python -m unittest discover
+ python test/test_download.py
+ nosetests
+
+If you want to create a build of youtube-dl yourself, you'll need
+
+- python
+- make (only GNU make is supported)
+- pandoc
+- zip
+- nosetests
+
+Adding support for a new site
+
+If you want to add support for a new site, first of all MAKE SURE this
+site is NOT DEDICATED TO COPYRIGHT INFRINGEMENT. youtube-dl does NOT
+SUPPORT such sites thus pull requests adding support for them WILL BE
+REJECTED.
+
+After you have ensured this site is distributing its content legally,
+you can follow this quick list (assuming your service is called
+yourextractor):
+
+1. Fork this repository
+2. Check out the source code with:
+
+ git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git
+
+3. Start a new git branch with
+
+ cd youtube-dl
+ git checkout -b yourextractor
+
+4. Start with this simple template and save it to
+ youtube_dl/extractor/yourextractor.py:
+
+ # coding: utf-8
+ from __future__ import unicode_literals
+
+ from .common import InfoExtractor
+
+
+ class YourExtractorIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
+ _TEST = {
+ 'url': 'http://yourextractor.com/watch/42',
+ 'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
+ 'info_dict': {
+ 'id': '42',
+ 'ext': 'mp4',
+ 'title': 'Video title goes here',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ # TODO more properties, either as:
+ # * A value
+ # * MD5 checksum; start the string with md5:
+ # * A regular expression; start the string with re:
+ # * Any Python type (for example int or float)
+ }
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+
+ # TODO more code goes here, for example ...
+ title = self._html_search_regex(r'<h1>(.+?)</h1>', webpage, 'title')
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': self._og_search_description(webpage),
+ 'uploader': self._search_regex(r'<div[^>]+id="uploader"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False),
+ # TODO more properties (see youtube_dl/extractor/common.py)
+ }
+
+5. Add an import in youtube_dl/extractor/extractors.py.
+6. Run python test/test_download.py TestDownload.test_YourExtractor.
+ This _should fail_ at first, but you can continually re-run it until
+ you're done. If you decide to add more than one test, then rename
+ _TEST to _TESTS and make it into a list of dictionaries. The tests
+ will then be named TestDownload.test_YourExtractor,
+ TestDownload.test_YourExtractor_1,
+ TestDownload.test_YourExtractor_2, etc.
+7. Have a look at youtube_dl/extractor/common.py for possible helper
+ methods and a detailed description of what your extractor should and
+ may return. Add tests and code for as many as you want.
+8. Make sure your code follows youtube-dl coding conventions and check
+ the code with flake8. Also make sure your code works under all
+ Python versions claimed supported by youtube-dl, namely 2.6, 2.7,
+ and 3.2+.
+9. When the tests pass, add the new files and commit them and push the
+ result, like this:
+
+ $ git add youtube_dl/extractor/extractors.py
+ $ git add youtube_dl/extractor/yourextractor.py
+ $ git commit -m '[yourextractor] Add new extractor'
+ $ git push origin yourextractor
+
+10. Finally, create a pull request. We'll then review and merge it.
+
+In any case, thank you very much for your contributions!
+
+
+youtube-dl coding conventions
+
+This section introduces guidelines for writing idiomatic, robust and
+future-proof extractor code.
+
+Extractors are very fragile by nature since they depend on the layout
+of the source data provided by 3rd party media hosters, which is out of
+your control and tends to change. As an extractor implementer your task
+is not only to write code that extracts media links and metadata
+correctly, but also to minimize dependence on the source's layout and
+even to anticipate potential future changes so the code is ready for
+them. This is important because it keeps the extractor from breaking on
+minor layout changes, which in turn keeps old youtube-dl versions
+working. Even though such breakage is easily fixed by releasing a new
+version of youtube-dl with the fix incorporated, all the previous
+versions remain broken in all repositories and distros' packages, which
+may not be so prompt in fetching the update from us. Needless to say,
+some non-rolling-release distros may never receive an update at all.
+
+Mandatory and optional metafields
+
+For extraction to work, youtube-dl relies on the metadata your
+extractor extracts and provides to youtube-dl, expressed as an
+information dictionary, or simply _info dict_. Only the following meta
+fields in the _info dict_ are considered mandatory for a successful
+extraction process by youtube-dl:
+
+- id (media identifier)
+- title (media title)
+- url (media download URL) or formats
+
+In fact only the last option is technically mandatory (i.e. if you
+can't figure out the download location of the media, the extraction
+does not make any sense), but by convention youtube-dl also treats id
+and title as mandatory. Thus the aforementioned metafields are the
+critical data without which extraction does not make sense; if any of
+them fails to be extracted, the extractor is considered completely
+broken.
+
+Any field apart from the aforementioned ones is considered OPTIONAL.
+That means that extraction should be TOLERANT of situations where the
+sources for these fields can potentially be unavailable (even if they
+are always available at the moment) and FUTURE-PROOF in order not to
+break the extraction of the general-purpose mandatory fields.
+
+Example
+
+Say you have some source dictionary meta that you've fetched as JSON
+with an HTTP request and it has a key summary:
+
+ meta = self._download_json(url, video_id)
+
+Assume at this point meta's layout is:
+
+ {
+ ...
+ "summary": "some fancy summary text",
+ ...
+ }
+
+Assume you want to extract summary and put it into the resulting info
+dict as description. Since description is an optional meta field, you
+should be prepared for this key to be missing from the meta dict, so
+you should extract it like:
+
+ description = meta.get('summary') # correct
+
+and not like:
+
+ description = meta['summary'] # incorrect
+
+The latter will break the extraction process with a KeyError if summary
+disappears from meta at some later time, while with the former approach
+extraction will just go ahead with description set to None, which is
+perfectly fine (remember, None is equivalent to the absence of data).
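The difference is easy to demonstrate with a plain dict, assuming summary has indeed disappeared from the source:

```python
meta = {'id': '42', 'title': 'Some Video'}  # no 'summary' key any more

description = meta.get('summary')  # None; extraction continues
# meta['summary'] would instead raise KeyError and abort extraction
```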
+
+Similarly, you should pass fatal=False when extracting optional data
+from a webpage with _search_regex, _html_search_regex or similar
+methods, for instance:
+
+ description = self._search_regex(
+ r'<span[^>]+id="title"[^>]*>([^<]+)<',
+ webpage, 'description', fatal=False)
+
+With fatal set to False, if _search_regex fails to extract description
+it will emit a warning and continue extraction.
+
+You can also pass default=<some fallback value>, for example:
+
+ description = self._search_regex(
+ r'<span[^>]+id="title"[^>]*>([^<]+)<',
+ webpage, 'description', default=None)
+
+On failure this code will silently continue the extraction with
+description set to None. That is useful for metafields that may or may
+not be present.
+
+Provide fallbacks
+
+When extracting metadata, try to do so from multiple sources. For
+example, if title is present in several places, try extracting it from
+at least some of them. This makes the extractor more future-proof in
+case some of the sources become unavailable.
+
+Example
+
+Say meta from the previous example has a title and you are about to
+extract it. Since title is a mandatory meta field you should end up with
+something like:
+
+ title = meta['title']
+
+If title disappears from meta in the future due to some changes on the
+hoster's side, the extraction will fail, since title is mandatory.
+That's expected.
+
+Assume that you have another source you can extract title from, for
+example the og:title HTML meta tag of a webpage. In this case you can
+provide a fallback scenario:
+
+ title = meta.get('title') or self._og_search_title(webpage)
+
+This code will try to extract the title from meta first, and if that
+fails it will try extracting og:title from the webpage.
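Assuming both sources are available as plain Python values, the fallback behaves like this (note that an empty string is falsy, so it also triggers the fallback):

```python
def extract_title(meta, og_title):
    # Prefer the title from the meta dict; fall back to og:title
    # when it is missing, None or empty
    return meta.get('title') or og_title

title = extract_title({'title': 'From meta'}, 'From og:title')  # 'From meta'
fallback = extract_title({}, 'From og:title')                   # 'From og:title'
```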
+
+Make regular expressions flexible
+
+When using regular expressions, try to write them in a fuzzy and
+flexible way.
+
+Example
+
+Say you need to extract title from the following HTML code:
+
+ <span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">some fancy title</span>
+
+The code for that task should look similar to:
+
+ title = self._search_regex(
+ r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
+
+Or even better:
+
+ title = self._search_regex(
+ r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)',
+ webpage, 'title', group='title')
+
+Note how you tolerate potential changes in the style attribute's value
+or a switch from double quotes to single quotes for the class
+attribute.
+
+The code definitely should not look like:
+
+    title = self._search_regex(
+        r'<span style="position: absolute; left: 910px; width: 90px; float: right; z-index: 9999;" class="title">(.*?)</span>',
+        webpage, 'title')
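You can verify the flexible pattern against the sample markup with plain re (which _search_regex wraps):

```python
import re

PATTERN = r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)'

webpage = ('<span style="position: absolute; left: 910px; width: 90px; '
           'float: right; z-index: 9999;" class="title">some fancy title'
           '</span>')
title = re.search(PATTERN, webpage).group('title')  # 'some fancy title'

# The same pattern still matches after a style change and a switch
# to single quotes around the class value:
also = re.search(PATTERN,
                 "<span class='title'>some fancy title</span>").group('title')
```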
+
+Use safe conversion functions
+
+Wrap all extracted numeric data in the safe conversion functions from
+utils: int_or_none, float_or_none. Use them for string-to-number
+conversions as well.
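For illustration, a simplified sketch of what these helpers do; the real versions in youtube_dl/utils.py take additional parameters (e.g. for scaling), so treat this only as a model of their behavior:

```python
def int_or_none(v, default=None):
    """Convert v to int, returning default instead of raising."""
    try:
        return int(v)
    except (TypeError, ValueError):
        return default


def float_or_none(v, default=None):
    """Convert v to float, returning default instead of raising."""
    try:
        return float(v)
    except (TypeError, ValueError):
        return default
```

With these, view_count = int_or_none(meta.get('views')) yields an int or None, never a crash.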
+
+
+
+EMBEDDING YOUTUBE-DL
+
+
+youtube-dl makes the best effort to be a good command-line program, and
+thus should be callable from any programming language. If you encounter
+any problems parsing its output, feel free to create a report.
+
+From a Python program, you can embed youtube-dl in a more powerful
+fashion, like this:
+
+ from __future__ import unicode_literals
+ import youtube_dl
+
+ ydl_opts = {}
+ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
+ ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
+
+Most likely, you'll want to use various options. For a list of options
+available, have a look at youtube_dl/YoutubeDL.py. For a start, if you
+want to intercept youtube-dl's output, set a logger object.
+
+Here's a more complete example of a program that outputs only errors
+(and a short message after the download is finished), and
+downloads/converts the video to an mp3 file:
+
+ from __future__ import unicode_literals
+ import youtube_dl
+
+
+ class MyLogger(object):
+ def debug(self, msg):
+ pass
+
+ def warning(self, msg):
+ pass
+
+ def error(self, msg):
+ print(msg)
+
+
+ def my_hook(d):
+ if d['status'] == 'finished':
+ print('Done downloading, now converting ...')
+
+
+ ydl_opts = {
+ 'format': 'bestaudio/best',
+ 'postprocessors': [{
+ 'key': 'FFmpegExtractAudio',
+ 'preferredcodec': 'mp3',
+ 'preferredquality': '192',
+ }],
+ 'logger': MyLogger(),
+ 'progress_hooks': [my_hook],
+ }
+ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
+ ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
+
+
+
+BUGS
+
+
+Bugs and suggestions should be reported at:
+https://github.com/rg3/youtube-dl/issues. Unless you were prompted to or
+there is another pertinent reason (e.g. GitHub fails to accept the bug
+report), please do not send bug reports via personal email. For
+discussions, join us in the IRC channel #youtube-dl on freenode
+(webchat).
+
+PLEASE INCLUDE THE FULL OUTPUT OF YOUTUBE-DL WHEN RUN WITH -v, i.e. ADD
+-v flag to YOUR COMMAND LINE, copy the WHOLE output and post it in the
+issue body wrapped in ``` for better formatting. It should look similar
+to this:
+
+ $ youtube-dl -v <your command line>
+ [debug] System config: []
+ [debug] User config: []
+ [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
+ [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
+ [debug] youtube-dl version 2015.12.06
+ [debug] Git HEAD: 135392e
+ [debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
+ [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
+ [debug] Proxy map: {}
+ ...
+
+DO NOT POST SCREENSHOTS OF VERBOSE LOGS; ONLY PLAIN TEXT IS ACCEPTABLE.
+
+The output (including the first lines) contains important debugging
+information. Issues without the full output are often not reproducible
+and therefore do not get solved in short order, if ever.
+
+Please re-read your issue once again to avoid a couple of common
+mistakes (you can and should use this as a checklist):
+
+Is the description of the issue itself sufficient?
+
+We often get issue reports that we cannot really decipher. While in most
+cases we eventually get the required information after asking back
+multiple times, this poses an unnecessary drain on our resources. Many
+contributors, including myself, are also not native speakers, so we may
+misread some parts.
+
+So please elaborate on what feature you are requesting, or what bug you
+want to be fixed. Make sure that it's obvious
+
+- What the problem is
+- How it could be fixed
+- What your proposed solution would look like
+
+If your report is shorter than two lines, it is almost certainly missing
+some of these, which makes it hard for us to respond to it. We're often
+too polite to close the issue outright, but the missing info makes
+misinterpretation likely. As a committer myself, I often get frustrated
+by these issues, since the only possible way for me to move forward on
+them is to ask for clarification over and over.
+
+For bug reports, this means that your report should contain the
+_complete_ output of youtube-dl when called with the -v flag. The error
+message you get for (most) bugs even says so, but you would not believe
+how many of our bug reports do not contain this information.
+
+If your server has multiple IPs or you suspect censorship, adding
+--call-home may be a good idea to get more diagnostics. If the error is
+ERROR: Unable to extract ... and you cannot reproduce it from multiple
+countries, add --dump-pages (warning: this will yield a rather large
+output, redirect it to the file log.txt by adding >log.txt 2>&1 to your
+command-line) or upload the .dump files you get when you add
+--write-pages somewhere.
+
+SITE SUPPORT REQUESTS MUST CONTAIN AN EXAMPLE URL. An example URL is a
+URL you might want to download, like
+http://www.youtube.com/watch?v=BaW_jenozKc. There should be an obvious
+video present. Except under very special circumstances, the main page of
+a video service (e.g. http://www.youtube.com/) is _not_ an example URL.
+
+Are you using the latest version?
+
+Before reporting any issue, type youtube-dl -U. This should report that
+you're up-to-date. About 20% of the reports we receive are already
+fixed, but people are using outdated versions. This goes for feature
+requests as well.
+
+Is the issue already documented?
+
+Make sure that someone has not already opened the issue you're trying to
+open. Search at the top of the window or browse the GitHub Issues of
+this repository. If there is an issue, feel free to write something
+along the lines of "This affects me as well, with version 2015.01.01.
+Here is some more information on the issue: ...". While some issues may
+be old, a new post into them often spurs rapid activity.
+
+Why are existing options not enough?
+
+Before requesting a new feature, please have a quick peek at the list of
+supported options. Many feature requests are for features that actually
+exist already! Please, absolutely do show off your work in the issue
+report and detail how the existing similar options do _not_ solve your
+problem.
+
+Is there enough context in your bug report?
+
+People want to solve problems, and often think they do us a favor by
+breaking down their larger problem (e.g. wanting to skip already
+downloaded files) into a specific request (e.g. asking us to check
+whether the file exists before downloading the info page). However, what
+often happens is that the problem gets split into two steps: one simple,
+and one impossible (or extremely complicated).
+
+We are then presented with a very complicated request when the original
+problem could be solved far easier, e.g. by recording the downloaded
+video IDs in a separate file. To avoid this, you must include the
+greater context where it is non-obvious. In particular, every feature
+request that does not consist of adding support for a new site should
+contain a use case scenario that explains in what situation the missing
+feature would be useful.
+
+Does the issue involve one problem, and one problem only?
+
+Some of our users seem to think there is a limit to the number of issues
+they can or should open. There is no such limit. While
+it may seem appealing to be able to dump all your issues into one
+ticket, that means that someone who solves one of your issues cannot
+mark the issue as closed. Typically, reporting a bunch of issues leads
+to the ticket lingering since nobody wants to attack that behemoth,
+until someone mercifully splits the issue into multiple ones.
+
+In particular, every site support request issue should only pertain to
+services at one site (generally under a common domain, but always using
+the same backend technology). Do not request support for Vimeo user
+videos, White House podcasts, and Google Plus pages in the same issue.
+Also, make sure that you don't post bug reports alongside feature
+requests. As a rule of thumb, a feature request does not include outputs
+of youtube-dl that are not immediately related to the feature at hand.
+Do not post reports of a network error alongside the request for a new
+video service.
+
+Is anyone going to need the feature?
+
+Only post features that you (or an incapacitated friend you can
+personally talk to) require. Do not post features because they seem like
+a good idea. If they are really useful, they will be requested by
+someone who requires them.
+
+Is your question about youtube-dl?
+
+It may sound strange, but some bug reports we receive are completely
+unrelated to youtube-dl and relate to a different, or even the
+reporter's own, application. Please make sure that you are actually
+using youtube-dl. If you are using a UI for youtube-dl, report the bug
+to the maintainer of the actual application providing the UI. On the
+other hand, if your UI for youtube-dl fails in some way you believe is
+related to youtube-dl, by all means, go ahead and report the bug.
+
+
+
+COPYRIGHT
+
+
+youtube-dl is released into the public domain by the copyright holders.
+
+This README file was originally written by Daniel Bolton and is likewise
+released into the public domain.
self.send_header('Content-Length', len(msg))
self.end_headers()
self.wfile.write(msg)
- except HTTPError as e:
- self.send_response(e.code, str(e))
else:
self.send_response(500, 'Unknown build method "%s"' % action)
else:
- **AdobeTVVideo**
- **AdultSwim**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network
- - **AfreecaTV**: afreecatv.com
+ - **afreecatv**: afreecatv.com
+ - **afreecatv:global**: afreecatv.com
- **AirMozilla**
- **AlJazeera**
- **Allocine**
- **awaan:live**
- **awaan:season**
- **awaan:video**
+ - **AZMedien**: AZ Medien videos
+ - **AZMedienPlaylist**: AZ Medien playlists
- **Azubu**
- **AzubuLive**
- **BaiduVideo**: 百度视频
- **bambuser:channel**
- **Bandcamp**
- **Bandcamp:album**
+ - **bangumi.bilibili.com**: BiliBili番剧
- **bbc**: BBC
- **bbc.co.uk**: BBC iPlayer
- **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist**
+ - **Beam:live**
- **Beatport**
- **Beeg**
- **BehindKink**
- **cbsnews**: CBS News
- **cbsnews:livevideo**: CBS News Live Videos
- **CBSSports**
- - **CCTV**
+ - **CCMA**
+ - **CCTV**: 央视网
- **CDA**
- **CeskaTelevize**
- **channel9**: Channel 9
- **Digiteka**
- **Discovery**
- **DiscoveryGo**
+ - **Disney**
- **Dotsub**
- **DouyuTV**: 斗鱼
- **DPlay**
- **DRBonanza**
- **Dropbox**
- **DrTuber**
- - **DRTV**
+ - **drtv**
+ - **drtv:live**
- **Dumpert**
- **dvtv**: http://video.aktualne.cz/
- **dw**
- **EaglePlatform**
- **EbaumsWorld**
- **EchoMsk**
+ - **egghead:course**: egghead.io course
- **eHow**
- **Einthusan**
- **eitb.tv**
- **fc2**
- **fc2:embed**
- **Fczenit**
- - **features.aol.com**
- **fernsehkritik.tv**
+ - **filmon**
+ - **filmon:channel**
- **Firstpost**
- **FiveTV**
- **Flickr**
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- - **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **Fusion**
- **Gamersyde**
- **GameSpot**
- **GameStar**
+ - **Gaskrank**
- **Gazeta**
- **GDCVault**
- **generic**: Generic downloader that works on some sites
- **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
+ - **HitRecord**
- **HornBunny**
- **HotNewHipHop**
- **HotStar**
- **Imgur**
- **ImgurAlbum**
- **Ina**
+ - **Inc**
- **Indavideo**
- **IndavideoEmbed**
- **InfoQ**
- **IPrima**
- **iqiyi**: 爱奇艺
- **Ir90Tv**
+ - **ITV**
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV
- **kuwo:singer**: 酷我音乐 - 歌手
- **kuwo:song**: 酷我音乐
- **la7.it**
- - **Laola1Tv**
+ - **laola1tv**
+ - **laola1tv:embed**
- **LCI**
- **Lcp**
- **LcpPlay**
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
+ - **Meipai**: 美拍
+ - **MelonVOD**
- **META**
- **metacafe**
- **Metacritic**
- **mtg**: MTG services
- **mtv**
- **mtv.de**
+ - **mtv81**
- **mtv:video**
- **mtvservices:embedded**
- **MuenchenTV**: münchen.tv
- **Newstube**
- **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞
+ - **NextTV**: 壹電視
- **nfb**: National Film Board of Canada
- **nfl.com**
- **NhkVod**
- **NRKPlaylist**
- **NRKSkole**: NRK Skole
- **NRKTV**: NRK TV and NRK Radio
+ - **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
+ - **NRKTVEpisodes**
+ - **NRKTVSeries**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
+ - **OnDemandKorea**
- **onet.tv**
- **onet.tv:channel**
- **OnionStudios**
- **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de**
- **Photobucket**
+ - **Piksel**
- **Pinkbike**
- **Pladform**
- **play.fm**
- **PolskieRadio**
- **PolskieRadioCategory**
- **PornCom**
+ - **PornFlip**
- **PornHd**
- **PornHub**: PornHub and Thumbzilla
- **PornHubPlaylist**
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**
- - **ScreenJunkies**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
- **Sexu**
- **Shahid**
- **Shared**: shared.sx
- - **ShareSix**
+ - **ShowRoomLive**
- **Sina**
- **SixPlay**
- **skynewsarabia:article**
- **Spiegeltv**
- **Spike**
- **Sport5**
- - **SportBox**
- **SportBoxEmbed**
- **SportDeutschland**
- **Sportschau**
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se
+ - **TVA**
- **TVANouvelles**
- **TVANouvellesArticle**
- **TVC**
- **Tweakers**
- **twitch:chapter**
- **twitch:clips**
- - **twitch:past_broadcasts**
- **twitch:profile**
- **twitch:stream**
- **twitch:video**
+ - **twitch:videos:all**
+ - **twitch:videos:highlights**
+ - **twitch:videos:past-broadcasts**
+ - **twitch:videos:uploads**
- **twitch:vod**
- **twitter**
- **twitter:amplify**
- **udemy**
- **udemy:course**
- **UDNEmbed**: 聯合影音
+ - **UKTVPlay**
- **Unistra**
- **uol.com.br**
- **uplynk**
- **ViceShow**
- **Vidbit**
- **Viddler**
+ - **Videa**
- **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective**
- **videomore:season**
- **videomore:video**
- **VideoPremium**
- - **VideoTt**: video.tt - Your True Tube (Currently broken)
+ - **VideoPress**
- **videoweed**: VideoWeed
- **Vidio**
- **vidme**
- **Vimple**: Vimple - one-click video hosting
- **Vine**
- **vine:user**
+ - **Viu**
+ - **viu:ott**
+ - **viu:playlist**
- **Vivo**: vivo.sx
- **vk**: VK
- **vk:uservideos**: VK - User's Videos
- **vk:wallpost**
- **vlive**
+ - **vlive:channel**
- **Vodlocker**
- **VODPlatform**
- **VoiceRepublic**
- **VRT**
- **vube**: Vube.com
- **VuClip**
+ - **VVVVID**
- **VyboryMos**
- **Vzaar**
- **Walla**
+++ /dev/null
-[wheel]
-universal = True
-
-[flake8]
-exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
-ignore = E402,E501,E731
lowercase_escape,
url_basename,
base_url,
+ urljoin,
urlencode_postdata,
urshift,
update_url_query,
self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
self.assertEqual(unified_strdate('Feb 7, 2016 at 6:35 pm'), '20160207')
+ self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
+ self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
+ self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
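The new ordinal-date cases can be handled by stripping the day suffix before format matching. A minimal sketch under that assumption (`parse_ordinal_date` is a hypothetical helper for illustration; the real `unified_strdate` walks a table of known formats):

```python
import re
from datetime import datetime

def parse_ordinal_date(date_str):
    # Hypothetical helper: dates like "July 15th, 2013" need the ordinal
    # suffix (st/nd/rd/th) removed before strptime can match them.
    cleaned = re.sub(r'(?<=\d)(st|nd|rd|th)\b', '', date_str)
    for fmt in ('%B %d, %Y', '%b %d, %Y'):  # full and abbreviated month names
        try:
            return datetime.strptime(cleaned, fmt).strftime('%Y%m%d')
        except ValueError:
            continue
    return None
```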
def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
self.assertEqual(base_url('http://foo.de/bar/baz'), 'http://foo.de/bar/')
self.assertEqual(base_url('http://foo.de/bar/baz?x=z/x/c'), 'http://foo.de/bar/')
+ def test_urljoin(self):
+ self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('//foo.de/', '/a/b/c.txt'), '//foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de/', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de/', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de/', '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
+ self.assertEqual(urljoin(None, 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin(None, '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin(['foobar'], 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
+ self.assertEqual(urljoin('http://foo.de/', None), None)
+ self.assertEqual(urljoin('http://foo.de/', ''), None)
+ self.assertEqual(urljoin('http://foo.de/', ['foobar']), None)
+ self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt')
+
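The `test_urljoin` cases above pin down quite specific edge-case behavior (non-string inputs yield `None`, absolute and protocol-relative paths ignore the base). A sketch that satisfies them, assuming Python 3 naming and delegating the actual resolution to the standard library:

```python
import re
from urllib.parse import urljoin as _stdlib_urljoin

def urljoin(base, path):
    # Non-string or empty paths yield None instead of raising.
    if not isinstance(path, str) or not path:
        return None
    # Absolute URLs and protocol-relative paths ignore the base entirely.
    if re.match(r'(?:[a-zA-Z][a-zA-Z0-9+.-]*:)?//', path):
        return path
    # The base must itself be an HTTP(S) or protocol-relative URL.
    if not isinstance(base, str) or not re.match(r'(?:https?:)?//', base):
        return None
    # The standard library handles relative segments such as "../".
    return _stdlib_urljoin(base, path)
```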
def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None)
self.assertEqual(parse_age_limit(False), None)
self.assertEqual(parse_duration('1 hour 3 minutes'), 3780)
self.assertEqual(parse_duration('87 Min.'), 5220)
self.assertEqual(parse_duration('PT1H0.040S'), 3600.04)
+ self.assertEqual(parse_duration('PT00H03M30SZ'), 210)
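The `PT00H03M30SZ` case is a slightly non-standard ISO-8601 duration (zero-padded fields and a stray trailing Z). A tolerant parser for this family of strings can be sketched as follows (`parse_iso8601_duration` is an illustrative stand-in for part of what `parse_duration` accepts):

```python
import re

def parse_iso8601_duration(s):
    # Accepts ISO-8601 style durations such as "PT1H0.040S" and the
    # non-standard "PT00H03M30SZ" (a trailing Z is tolerated).
    m = re.match(
        r'(?i)P?T?(?:(?P<h>[\d.]+)H)?(?:(?P<m>[\d.]+)M)?(?:(?P<s>[\d.]+)S)?Z?$',
        s)
    if not m or not any(m.groupdict().values()):
        return None
    h, mi, sec = (float(m.group(g) or 0) for g in ('h', 'm', 's'))
    return h * 3600 + mi * 60 + sec
```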
def test_fix_xml_ampersands(self):
self.assertEqual(
on = js_to_json('["abc", "def",]')
self.assertEqual(json.loads(on), ['abc', 'def'])
+ on = js_to_json('[/*comment\n*/"abc"/*comment\n*/,/*comment\n*/"def",/*comment\n*/]')
+ self.assertEqual(json.loads(on), ['abc', 'def'])
+
+ on = js_to_json('[//comment\n"abc" //comment\n,//comment\n"def",//comment\n]')
+ self.assertEqual(json.loads(on), ['abc', 'def'])
+
on = js_to_json('{"abc": "def",}')
self.assertEqual(json.loads(on), {'abc': 'def'})
+ on = js_to_json('{/*comment\n*/"abc"/*comment\n*/:/*comment\n*/"def"/*comment\n*/,/*comment\n*/}')
+ self.assertEqual(json.loads(on), {'abc': 'def'})
+
on = js_to_json('{ 0: /* " \n */ ",]" , }')
self.assertEqual(json.loads(on), {'0': ',]'})
+ on = js_to_json('{ /*comment\n*/0/*comment\n*/: /* " \n */ ",]" , }')
+ self.assertEqual(json.loads(on), {'0': ',]'})
+
+ on = js_to_json('{ 0: // comment\n1 }')
+ self.assertEqual(json.loads(on), {'0': 1})
+
on = js_to_json(r'["<p>x<\/p>"]')
self.assertEqual(json.loads(on), ['<p>x</p>'])
on = js_to_json("['a\\\nb']")
self.assertEqual(json.loads(on), ['ab'])
+ on = js_to_json("/*comment\n*/[/*comment\n*/'a\\\nb'/*comment\n*/]/*comment\n*/")
+ self.assertEqual(json.loads(on), ['ab'])
+
on = js_to_json('{0xff:0xff}')
self.assertEqual(json.loads(on), {'255': 255})
+ on = js_to_json('{/*comment\n*/0xff/*comment\n*/:/*comment\n*/0xff/*comment\n*/}')
+ self.assertEqual(json.loads(on), {'255': 255})
+
on = js_to_json('{077:077}')
self.assertEqual(json.loads(on), {'63': 63})
+ on = js_to_json('{/*comment\n*/077/*comment\n*/:/*comment\n*/077/*comment\n*/}')
+ self.assertEqual(json.loads(on), {'63': 63})
+
on = js_to_json('{42:42}')
self.assertEqual(json.loads(on), {'42': 42})
+ on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
+ self.assertEqual(json.loads(on), {'42': 42})
+
def test_extract_attributes(self):
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
+++ /dev/null
-[tox]
-envlist = py26,py27,py33,py34,py35
-[testenv]
-deps =
- nose
- coverage
-# We need a valid $HOME for test_compat_expanduser
-passenv = HOME
-defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
- --exclude test_subtitles.py --exclude test_write_annotations.py
- --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
- --exclude test_socks.py
-commands = nosetests --verbose {posargs:{[testenv]defaultargs}} # --with-coverage --cover-package=youtube_dl --cover-html
- # test.test_download:TestDownload.test_NowVideo
--- /dev/null
+.TH "YOUTUBE\-DL" "1" "" "" ""
+.SH NAME
+.PP
+youtube\-dl \- download videos from youtube.com or other video platforms
+.SH SYNOPSIS
+.PP
+\f[B]youtube\-dl\f[] [OPTIONS] URL [URL...]
+.SH DESCRIPTION
+.PP
+\f[B]youtube\-dl\f[] is a command\-line program to download videos from
+YouTube.com and a few more sites.
+It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is
+not platform specific.
+It should work on your Unix box, on Windows or on Mac OS X.
+It is released to the public domain, which means you can modify it,
+redistribute it or use it however you like.
+.SH OPTIONS
+.TP
+.B \-h, \-\-help
+Print this help text and exit
+.RS
+.RE
+.TP
+.B \-\-version
+Print program version and exit
+.RS
+.RE
+.TP
+.B \-U, \-\-update
+Update this program to the latest version.
+Make sure that you have sufficient permissions (run with sudo if needed)
+.RS
+.RE
+.TP
+.B \-i, \-\-ignore\-errors
+Continue on download errors, for example to skip unavailable videos in a
+playlist
+.RS
+.RE
+.TP
+.B \-\-abort\-on\-error
+Abort downloading of further videos (in the playlist or the command
+line) if an error occurs
+.RS
+.RE
+.TP
+.B \-\-dump\-user\-agent
+Display the current browser identification
+.RS
+.RE
+.TP
+.B \-\-list\-extractors
+List all supported extractors
+.RS
+.RE
+.TP
+.B \-\-extractor\-descriptions
+Output descriptions of all supported extractors
+.RS
+.RE
+.TP
+.B \-\-force\-generic\-extractor
+Force extraction to use the generic extractor
+.RS
+.RE
+.TP
+.B \-\-default\-search \f[I]PREFIX\f[]
+Use this prefix for unqualified URLs.
+For example, "gvsearch2:" makes youtube\-dl "large apple" download two
+videos from Google Videos matching "large apple".
+Use the value "auto" to let youtube\-dl guess ("auto_warning" to emit a
+warning when guessing).
+"error" just throws an error.
+The default value "fixup_error" repairs broken URLs, but emits an error
+if this is not possible instead of searching.
+.RS
+.RE
+.TP
+.B \-\-ignore\-config
+Do not read configuration files.
+When given in the global configuration file /etc/youtube\-dl.conf: Do
+not read the user configuration in ~/.config/youtube\-dl/config
+(%APPDATA%/youtube\-dl/config.txt on Windows)
+.RS
+.RE
+.TP
+.B \-\-config\-location \f[I]PATH\f[]
+Location of the configuration file; either the path to the config or its
+containing directory.
+.RS
+.RE
+.TP
+.B \-\-flat\-playlist
+Do not extract the videos of a playlist, only list them.
+.RS
+.RE
+.TP
+.B \-\-mark\-watched
+Mark videos watched (YouTube only)
+.RS
+.RE
+.TP
+.B \-\-no\-mark\-watched
+Do not mark videos watched (YouTube only)
+.RS
+.RE
+.TP
+.B \-\-no\-color
+Do not emit color codes in output
+.RS
+.RE
+.SS Network Options:
+.TP
+.B \-\-proxy \f[I]URL\f[]
+Use the specified HTTP/HTTPS/SOCKS proxy.
+To enable experimental SOCKS proxy, specify a proper scheme.
+For example socks5://127.0.0.1:1080/.
+Pass in an empty string (\-\-proxy "") for direct connection
+.RS
+.RE
+.TP
+.B \-\-socket\-timeout \f[I]SECONDS\f[]
+Time to wait before giving up, in seconds
+.RS
+.RE
+.TP
+.B \-\-source\-address \f[I]IP\f[]
+Client\-side IP address to bind to
+.RS
+.RE
+.TP
+.B \-4, \-\-force\-ipv4
+Make all connections via IPv4
+.RS
+.RE
+.TP
+.B \-6, \-\-force\-ipv6
+Make all connections via IPv6
+.RS
+.RE
+.TP
+.B \-\-geo\-verification\-proxy \f[I]URL\f[]
+Use this proxy to verify the IP address for some geo\-restricted sites.
+The default proxy specified by \-\-proxy (or none, if the option is not
+present) is used for the actual downloading.
+.RS
+.RE
+.SS Video Selection:
+.TP
+.B \-\-playlist\-start \f[I]NUMBER\f[]
+Playlist video to start at (default is 1)
+.RS
+.RE
+.TP
+.B \-\-playlist\-end \f[I]NUMBER\f[]
+Playlist video to end at (default is last)
+.RS
+.RE
+.TP
+.B \-\-playlist\-items \f[I]ITEM_SPEC\f[]
+Playlist video items to download.
+Specify indices of the videos in the playlist separated by commas like:
+"\-\-playlist\-items 1,2,5,8" if you want to download videos indexed 1,
+2, 5, 8 in the playlist.
+You can specify a range: "\-\-playlist\-items 1\-3,7,10\-13", which will
+download the videos at indices 1, 2, 3, 7, 10, 11, 12 and 13.
+.RS
+.RE
+.TP
+.B \-\-match\-title \f[I]REGEX\f[]
+Download only matching titles (regex or caseless sub\-string)
+.RS
+.RE
+.TP
+.B \-\-reject\-title \f[I]REGEX\f[]
+Skip download for matching titles (regex or caseless sub\-string)
+.RS
+.RE
+.TP
+.B \-\-max\-downloads \f[I]NUMBER\f[]
+Abort after downloading NUMBER files
+.RS
+.RE
+.TP
+.B \-\-min\-filesize \f[I]SIZE\f[]
+Do not download any videos smaller than SIZE (e.g.
+50k or 44.6m)
+.RS
+.RE
+.TP
+.B \-\-max\-filesize \f[I]SIZE\f[]
+Do not download any videos larger than SIZE (e.g.
+50k or 44.6m)
+.RS
+.RE
+.TP
+.B \-\-date \f[I]DATE\f[]
+Download only videos uploaded on this date
+.RS
+.RE
+.TP
+.B \-\-datebefore \f[I]DATE\f[]
+Download only videos uploaded on or before this date (i.e.
+inclusive)
+.RS
+.RE
+.TP
+.B \-\-dateafter \f[I]DATE\f[]
+Download only videos uploaded on or after this date (i.e.
+inclusive)
+.RS
+.RE
+.TP
+.B \-\-min\-views \f[I]COUNT\f[]
+Do not download any videos with fewer than COUNT views
+.RS
+.RE
+.TP
+.B \-\-max\-views \f[I]COUNT\f[]
+Do not download any videos with more than COUNT views
+.RS
+.RE
+.TP
+.B \-\-match\-filter \f[I]FILTER\f[]
+Generic video filter.
+Specify any key (see help for \-o for a list of available keys) to match
+if the key is present, !key to check if the key is not present, key >
+NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to
+compare against a number, and & to require multiple matches.
+Values which are not known are excluded unless you put a question mark
+(?) after the operator. For example, to only match videos that have been
+liked more than 100 times and disliked less than 50 times (or where the
+dislike functionality is not available at the given service), and that
+also have a description, use \-\-match\-filter "like_count > 100 &
+dislike_count <? 50 & description".
+.RS
+.RE
+.TP
+.B \-\-no\-playlist
+Download only the video, if the URL refers to a video and a playlist.
+.RS
+.RE
+.TP
+.B \-\-yes\-playlist
+Download the playlist, if the URL refers to a video and a playlist.
+.RS
+.RE
+.TP
+.B \-\-age\-limit \f[I]YEARS\f[]
+Download only videos suitable for the given age
+.RS
+.RE
+.TP
+.B \-\-download\-archive \f[I]FILE\f[]
+Download only videos not listed in the archive file.
+Record the IDs of all downloaded videos in it.
+.RS
+.RE
+.TP
+.B \-\-include\-ads
+Download advertisements as well (experimental)
+.RS
+.RE
+.SS Download Options:
+.TP
+.B \-r, \-\-limit\-rate \f[I]RATE\f[]
+Maximum download rate in bytes per second (e.g.
+50K or 4.2M)
+.RS
+.RE
+.TP
+.B \-R, \-\-retries \f[I]RETRIES\f[]
+Number of retries (default is 10), or "infinite".
+.RS
+.RE
+.TP
+.B \-\-fragment\-retries \f[I]RETRIES\f[]
+Number of retries for a fragment (default is 10), or "infinite" (DASH
+and hlsnative only)
+.RS
+.RE
+.TP
+.B \-\-skip\-unavailable\-fragments
+Skip unavailable fragments (DASH and hlsnative only)
+.RS
+.RE
+.TP
+.B \-\-abort\-on\-unavailable\-fragment
+Abort downloading when some fragment is not available
+.RS
+.RE
+.TP
+.B \-\-buffer\-size \f[I]SIZE\f[]
+Size of download buffer (e.g.
+1024 or 16K) (default is 1024)
+.RS
+.RE
+.TP
+.B \-\-no\-resize\-buffer
+Do not automatically adjust the buffer size.
+By default, the buffer size is automatically resized from an initial
+value of SIZE.
+.RS
+.RE
+.TP
+.B \-\-playlist\-reverse
+Download playlist videos in reverse order
+.RS
+.RE
+.TP
+.B \-\-playlist\-random
+Download playlist videos in random order
+.RS
+.RE
+.TP
+.B \-\-xattr\-set\-filesize
+Set file xattribute ytdl.filesize with expected file size (experimental)
+.RS
+.RE
+.TP
+.B \-\-hls\-prefer\-native
+Use the native HLS downloader instead of ffmpeg
+.RS
+.RE
+.TP
+.B \-\-hls\-prefer\-ffmpeg
+Use ffmpeg instead of the native HLS downloader
+.RS
+.RE
+.TP
+.B \-\-hls\-use\-mpegts
+Use the mpegts container for HLS videos, allowing the video to be played
+while it is still downloading (some players may not be able to play it)
+.RS
+.RE
+.TP
+.B \-\-external\-downloader \f[I]COMMAND\f[]
+Use the specified external downloader.
+Currently supports aria2c,avconv,axel,curl,ffmpeg,httpie,wget
+.RS
+.RE
+.TP
+.B \-\-external\-downloader\-args \f[I]ARGS\f[]
+Give these arguments to the external downloader
+.RS
+.RE
+.SS Filesystem Options:
+.TP
+.B \-a, \-\-batch\-file \f[I]FILE\f[]
+File containing URLs to download (\[aq]\-\[aq] for stdin)
+.RS
+.RE
+.TP
+.B \-\-id
+Use only video ID in file name
+.RS
+.RE
+.TP
+.B \-o, \-\-output \f[I]TEMPLATE\f[]
+Output filename template, see the "OUTPUT TEMPLATE" for all the info
+.RS
+.RE
+.TP
+.B \-\-autonumber\-size \f[I]NUMBER\f[]
+Specify the number of digits in %(autonumber)s when it is present in
+output filename template or \-\-auto\-number option is given (default is
+5)
+.RS
+.RE
+.TP
+.B \-\-autonumber\-start \f[I]NUMBER\f[]
+Specify the start value for %(autonumber)s (default is 1)
+.RS
+.RE
+.TP
+.B \-\-restrict\-filenames
+Restrict filenames to only ASCII characters, and avoid "&" and spaces in
+filenames
+.RS
+.RE
+.TP
+.B \-A, \-\-auto\-number
+[deprecated; use \-o "%(autonumber)s\-%(title)s.%(ext)s" ] Number
+downloaded files starting from 00000
+.RS
+.RE
+.TP
+.B \-t, \-\-title
+[deprecated] Use title in file name (default)
+.RS
+.RE
+.TP
+.B \-l, \-\-literal
+[deprecated] Alias of \-\-title
+.RS
+.RE
+.TP
+.B \-w, \-\-no\-overwrites
+Do not overwrite files
+.RS
+.RE
+.TP
+.B \-c, \-\-continue
+Force resume of partially downloaded files.
+By default, youtube\-dl will resume downloads if possible.
+.RS
+.RE
+.TP
+.B \-\-no\-continue
+Do not resume partially downloaded files (restart from beginning)
+.RS
+.RE
+.TP
+.B \-\-no\-part
+Do not use .part files \- write directly into output file
+.RS
+.RE
+.TP
+.B \-\-no\-mtime
+Do not use the Last\-modified header to set the file modification time
+.RS
+.RE
+.TP
+.B \-\-write\-description
+Write video description to a .description file
+.RS
+.RE
+.TP
+.B \-\-write\-info\-json
+Write video metadata to a .info.json file
+.RS
+.RE
+.TP
+.B \-\-write\-annotations
+Write video annotations to a .annotations.xml file
+.RS
+.RE
+.TP
+.B \-\-load\-info\-json \f[I]FILE\f[]
+JSON file containing the video information (created with the
+"\-\-write\-info\-json" option)
+.RS
+.RE
+.TP
+.B \-\-cookies \f[I]FILE\f[]
+File to read cookies from and dump cookie jar in
+.RS
+.RE
+.TP
+.B \-\-cache\-dir \f[I]DIR\f[]
+Location in the filesystem where youtube\-dl can store some downloaded
+information permanently.
+By default $XDG_CACHE_HOME/youtube\-dl or ~/.cache/youtube\-dl .
+At the moment, only YouTube player files (for videos with obfuscated
+signatures) are cached, but that may change.
+.RS
+.RE
+.TP
+.B \-\-no\-cache\-dir
+Disable filesystem caching
+.RS
+.RE
+.TP
+.B \-\-rm\-cache\-dir
+Delete all filesystem cache files
+.RS
+.RE
+.SS Thumbnail images:
+.TP
+.B \-\-write\-thumbnail
+Write thumbnail image to disk
+.RS
+.RE
+.TP
+.B \-\-write\-all\-thumbnails
+Write all thumbnail image formats to disk
+.RS
+.RE
+.TP
+.B \-\-list\-thumbnails
+Simulate and list all available thumbnail formats
+.RS
+.RE
+.SS Verbosity / Simulation Options:
+.TP
+.B \-q, \-\-quiet
+Activate quiet mode
+.RS
+.RE
+.TP
+.B \-\-no\-warnings
+Ignore warnings
+.RS
+.RE
+.TP
+.B \-s, \-\-simulate
+Do not download the video and do not write anything to disk
+.RS
+.RE
+.TP
+.B \-\-skip\-download
+Do not download the video
+.RS
+.RE
+.TP
+.B \-g, \-\-get\-url
+Simulate, quiet but print URL
+.RS
+.RE
+.TP
+.B \-e, \-\-get\-title
+Simulate, quiet but print title
+.RS
+.RE
+.TP
+.B \-\-get\-id
+Simulate, quiet but print id
+.RS
+.RE
+.TP
+.B \-\-get\-thumbnail
+Simulate, quiet but print thumbnail URL
+.RS
+.RE
+.TP
+.B \-\-get\-description
+Simulate, quiet but print video description
+.RS
+.RE
+.TP
+.B \-\-get\-duration
+Simulate, quiet but print video length
+.RS
+.RE
+.TP
+.B \-\-get\-filename
+Simulate, quiet but print output filename
+.RS
+.RE
+.TP
+.B \-\-get\-format
+Simulate, quiet but print output format
+.RS
+.RE
+.TP
+.B \-j, \-\-dump\-json
+Simulate, quiet but print JSON information.
+See \-\-output for a description of available keys.
+.RS
+.RE
+.TP
+.B \-J, \-\-dump\-single\-json
+Simulate, quiet but print JSON information for each command\-line
+argument.
+If the URL refers to a playlist, dump the whole playlist information in
+a single line.
+.RS
+.RE
+.TP
+.B \-\-print\-json
+Be quiet and print the video information as JSON (video is still being
+downloaded).
+.RS
+.RE
+.TP
+.B \-\-newline
+Output progress bar as new lines
+.RS
+.RE
+.TP
+.B \-\-no\-progress
+Do not print progress bar
+.RS
+.RE
+.TP
+.B \-\-console\-title
+Display progress in console titlebar
+.RS
+.RE
+.TP
+.B \-v, \-\-verbose
+Print various debugging information
+.RS
+.RE
+.TP
+.B \-\-dump\-pages
+Print downloaded pages encoded using base64 to debug problems (very
+verbose)
+.RS
+.RE
+.TP
+.B \-\-write\-pages
+Write downloaded intermediary pages to files in the current directory to
+debug problems
+.RS
+.RE
+.TP
+.B \-\-print\-traffic
+Display sent and read HTTP traffic
+.RS
+.RE
+.TP
+.B \-C, \-\-call\-home
+Contact the youtube\-dl server for debugging
+.RS
+.RE
+.TP
+.B \-\-no\-call\-home
+Do NOT contact the youtube\-dl server for debugging
+.RS
+.RE
+.SS Workarounds:
+.TP
+.B \-\-encoding \f[I]ENCODING\f[]
+Force the specified encoding (experimental)
+.RS
+.RE
+.TP
+.B \-\-no\-check\-certificate
+Suppress HTTPS certificate validation
+.RS
+.RE
+.TP
+.B \-\-prefer\-insecure
+Use an unencrypted connection to retrieve information about the video.
+(Currently supported only for YouTube)
+.RS
+.RE
+.TP
+.B \-\-user\-agent \f[I]UA\f[]
+Specify a custom user agent
+.RS
+.RE
+.TP
+.B \-\-referer \f[I]URL\f[]
+Specify a custom referer; use this if access to the video is restricted
+to one domain
+.RS
+.RE
+.TP
+.B \-\-add\-header \f[I]FIELD:VALUE\f[]
+Specify a custom HTTP header and its value, separated by a colon
+\[aq]:\[aq].
+You can use this option multiple times
+.RS
+.RE
+.TP
+.B \-\-bidi\-workaround
+Work around terminals that lack bidirectional text support.
+Requires bidiv or fribidi executable in PATH
+.RS
+.RE
+.TP
+.B \-\-sleep\-interval \f[I]SECONDS\f[]
+Number of seconds to sleep before each download when used alone or a
+lower bound of a range for randomized sleep before each download
+(minimum possible number of seconds to sleep) when used along with
+\-\-max\-sleep\-interval.
+.RS
+.RE
+.TP
+.B \-\-max\-sleep\-interval \f[I]SECONDS\f[]
+Upper bound of a range for randomized sleep before each download
+(maximum possible number of seconds to sleep).
+Must only be used along with \-\-sleep\-interval.
+.RS
+.RE
+.SS Video Format Options:
+.TP
+.B \-f, \-\-format \f[I]FORMAT\f[]
+Video format code, see the "FORMAT SELECTION" for all the info
+.RS
+.RE
+.TP
+.B \-\-all\-formats
+Download all available video formats
+.RS
+.RE
+.TP
+.B \-\-prefer\-free\-formats
+Prefer free video formats unless a specific one is requested
+.RS
+.RE
+.TP
+.B \-F, \-\-list\-formats
+List all available formats of requested videos
+.RS
+.RE
+.TP
+.B \-\-youtube\-skip\-dash\-manifest
+Do not download the DASH manifests and related data on YouTube videos
+.RS
+.RE
+.TP
+.B \-\-merge\-output\-format \f[I]FORMAT\f[]
+If a merge is required (e.g.
+bestvideo+bestaudio), output to given container format.
+One of mkv, mp4, ogg, webm, flv.
+Ignored if no merge is required
+.RS
+.RE
+.SS Subtitle Options:
+.TP
+.B \-\-write\-sub
+Write subtitle file
+.RS
+.RE
+.TP
+.B \-\-write\-auto\-sub
+Write automatically generated subtitle file (YouTube only)
+.RS
+.RE
+.TP
+.B \-\-all\-subs
+Download all the available subtitles of the video
+.RS
+.RE
+.TP
+.B \-\-list\-subs
+List all available subtitles for the video
+.RS
+.RE
+.TP
+.B \-\-sub\-format \f[I]FORMAT\f[]
+Subtitle format, accepts formats preference, for example: "srt" or
+"ass/srt/best"
+.RS
+.RE
+.TP
+.B \-\-sub\-lang \f[I]LANGS\f[]
+Languages of the subtitles to download (optional), separated by commas;
+use \-\-list\-subs for available language tags
+.RS
+.RE
+.SS Authentication Options:
+.TP
+.B \-u, \-\-username \f[I]USERNAME\f[]
+Login with this account ID
+.RS
+.RE
+.TP
+.B \-p, \-\-password \f[I]PASSWORD\f[]
+Account password.
+If this option is left out, youtube\-dl will ask interactively.
+.RS
+.RE
+.TP
+.B \-2, \-\-twofactor \f[I]TWOFACTOR\f[]
+Two\-factor authentication code
+.RS
+.RE
+.TP
+.B \-n, \-\-netrc
+Use .netrc authentication data
+.RS
+.RE
+.TP
+.B \-\-video\-password \f[I]PASSWORD\f[]
+Video password (vimeo, smotri, youku)
+.RS
+.RE
+.SS Adobe Pass Options:
+.TP
+.B \-\-ap\-mso \f[I]MSO\f[]
+Adobe Pass multiple\-system operator (TV provider) identifier, use
+\-\-ap\-list\-mso for a list of available MSOs
+.RS
+.RE
+.TP
+.B \-\-ap\-username \f[I]USERNAME\f[]
+Multiple\-system operator account login
+.RS
+.RE
+.TP
+.B \-\-ap\-password \f[I]PASSWORD\f[]
+Multiple\-system operator account password.
+If this option is left out, youtube\-dl will ask interactively.
+.RS
+.RE
+.TP
+.B \-\-ap\-list\-mso
+List all supported multiple\-system operators
+.RS
+.RE
+.SS Post\-processing Options:
+.TP
+.B \-x, \-\-extract\-audio
+Convert video files to audio\-only files (requires ffmpeg or avconv and
+ffprobe or avprobe)
+.RS
+.RE
+.TP
+.B \-\-audio\-format \f[I]FORMAT\f[]
+Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or
+"wav"; "best" by default; No effect without \-x
+.RS
+.RE
+.TP
+.B \-\-audio\-quality \f[I]QUALITY\f[]
+Specify ffmpeg/avconv audio quality, insert a value between 0 (better)
+and 9 (worse) for VBR or a specific bitrate like 128K (default 5)
+.RS
+.RE
+.TP
+.B \-\-recode\-video \f[I]FORMAT\f[]
+Encode the video to another format if necessary (currently supported:
+mp4|flv|ogg|webm|mkv|avi)
+.RS
+.RE
+.TP
+.B \-\-postprocessor\-args \f[I]ARGS\f[]
+Give these arguments to the postprocessor
+.RS
+.RE
+.TP
+.B \-k, \-\-keep\-video
+Keep the video file on disk after the post\-processing; the video is
+erased by default
+.RS
+.RE
+.TP
+.B \-\-no\-post\-overwrites
+Do not overwrite post\-processed files; the post\-processed files are
+overwritten by default
+.RS
+.RE
+.TP
+.B \-\-embed\-subs
+Embed subtitles in the video (only for mp4, webm and mkv videos)
+.RS
+.RE
+.TP
+.B \-\-embed\-thumbnail
+Embed thumbnail in the audio as cover art
+.RS
+.RE
+.TP
+.B \-\-add\-metadata
+Write metadata to the video file
+.RS
+.RE
+.TP
+.B \-\-metadata\-from\-title \f[I]FORMAT\f[]
+Parse additional metadata like song title / artist from the video title.
+The format syntax is the same as \-\-output, the parsed parameters
+replace existing values.
+Additional templates: %(album)s, %(artist)s.
+Example: \-\-metadata\-from\-title "%(artist)s \- %(title)s" matches a
+title like "Coldplay \- Paradise"
+.RS
+.RE
+.TP
+.B \-\-xattrs
+Write metadata to the video file\[aq]s xattrs (using dublin core and xdg
+standards)
+.RS
+.RE
+.TP
+.B \-\-fixup \f[I]POLICY\f[]
+Automatically correct known faults of the file.
+One of never (do nothing), warn (only emit a warning), detect_or_warn
+(the default; fix file if we can, warn otherwise)
+.RS
+.RE
+.TP
+.B \-\-prefer\-avconv
+Prefer avconv over ffmpeg for running the postprocessors (default)
+.RS
+.RE
+.TP
+.B \-\-prefer\-ffmpeg
+Prefer ffmpeg over avconv for running the postprocessors
+.RS
+.RE
+.TP
+.B \-\-ffmpeg\-location \f[I]PATH\f[]
+Location of the ffmpeg/avconv binary; either the path to the binary or
+its containing directory.
+.RS
+.RE
+.TP
+.B \-\-exec \f[I]CMD\f[]
+Execute a command on the file after downloading, similar to find\[aq]s
+\-exec syntax.
+Example: \-\-exec \[aq]adb push {} /sdcard/Music/ && rm {}\[aq]
+.RS
+.RE
+.TP
+.B \-\-convert\-subs \f[I]FORMAT\f[]
+Convert the subtitles to another format (currently supported: srt|ass|vtt)
+.RS
+.RE
+.SH CONFIGURATION
+.PP
+You can configure youtube\-dl by placing any supported command line
+option to a configuration file.
+On Linux and OS X, the system wide configuration file is located at
+\f[C]/etc/youtube\-dl.conf\f[] and the user wide configuration file at
+\f[C]~/.config/youtube\-dl/config\f[].
+On Windows, the user wide configuration file locations are
+\f[C]%APPDATA%\\youtube\-dl\\config.txt\f[] or
+\f[C]C:\\Users\\<user\ name>\\youtube\-dl.conf\f[].
+Note that the configuration file may not exist by default, so you may
+need to create it yourself.
+.PP
+For example, with the following configuration file youtube\-dl will
+always extract the audio, not copy the mtime, use a proxy and save all
+videos under \f[C]Movies\f[] directory in your home directory:
+.IP
+.nf
+\f[C]
+#\ Lines\ starting\ with\ #\ are\ comments
+
+#\ Always\ extract\ audio
+\-x
+
+#\ Do\ not\ copy\ the\ mtime
+\-\-no\-mtime
+
+#\ Use\ this\ proxy
+\-\-proxy\ 127.0.0.1:3128
+
+#\ Save\ all\ videos\ under\ Movies\ directory\ in\ your\ home\ directory
+\-o\ ~/Movies/%(title)s.%(ext)s
+\f[]
+.fi
+.PP
+Note that options in a configuration file are just the same options
+(switches) used in regular command line calls, thus there \f[B]must be
+no whitespace\f[] after \f[C]\-\f[] or \f[C]\-\-\f[], e.g.
+\f[C]\-o\f[] or \f[C]\-\-proxy\f[] but not \f[C]\-\ o\f[] or
+\f[C]\-\-\ proxy\f[].
+.PP
+You can use \f[C]\-\-ignore\-config\f[] if you want to disable the
+configuration file for a particular youtube\-dl run.
+.PP
+You can also use \f[C]\-\-config\-location\f[] if you want to use a
+custom configuration file for a particular youtube\-dl run.
+.SS Authentication with \f[C]\&.netrc\f[] file
+.PP
+You may also want to configure automatic credentials storage for
+extractors that support authentication (by providing login and password
+with \f[C]\-\-username\f[] and \f[C]\-\-password\f[]) in order not to
+pass credentials as command line arguments on every youtube\-dl
+execution and prevent tracking plain text passwords in the shell command
+history.
+You can achieve this using a \f[C]\&.netrc\f[]
+file (http://stackoverflow.com/tags/.netrc/info) on a per extractor
+basis.
+For that you will need to create a \f[C]\&.netrc\f[] file in your
+\f[C]$HOME\f[] and restrict permissions to read/write by only you:
+.IP
+.nf
+\f[C]
+touch\ $HOME/.netrc
+chmod\ a\-rwx,u+rw\ $HOME/.netrc
+\f[]
+.fi
+.PP
+After that you can add credentials for an extractor in the following
+format, where \f[I]extractor\f[] is the name of the extractor in
+lowercase:
+.IP
+.nf
+\f[C]
+machine\ <extractor>\ login\ <login>\ password\ <password>
+\f[]
+.fi
+.PP
+For example:
+.IP
+.nf
+\f[C]
+machine\ youtube\ login\ myaccount\@gmail.com\ password\ my_youtube_password
+machine\ twitch\ login\ my_twitch_account_name\ password\ my_twitch_password
+\f[]
+.fi
+.PP
+To activate authentication with the \f[C]\&.netrc\f[] file you should
+pass \f[C]\-\-netrc\f[] to youtube\-dl or place it in the configuration
+file (#configuration).
+.PP
+On Windows you may also need to set up the \f[C]%HOME%\f[] environment
+variable manually.
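+.PP
+For example, from a command prompt (one common approach; this makes the
+change persistent for your user account):
+.IP
+.nf
+\f[C]
+setx\ HOME\ %USERPROFILE%
+\f[]
+.fi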
+.SH OUTPUT TEMPLATE
+.PP
+The \f[C]\-o\f[] option allows users to indicate a template for the
+output file names.
+.PP
+\f[B]tl;dr:\f[] navigate me to examples (#output-template-examples).
+.PP
+The basic usage is not to set any template arguments when downloading a
+single file, like in
+\f[C]youtube\-dl\ \-o\ funny_video.flv\ "http://some/video"\f[].
+However, it may contain special sequences that will be replaced when
+downloading each video.
+The special sequences have the format \f[C]%(NAME)s\f[].
+To clarify, that is a percent symbol followed by a name in parentheses,
+followed by a lowercase S.
+Allowed names are:
+.IP \[bu] 2
+\f[C]id\f[]: Video identifier
+.IP \[bu] 2
+\f[C]title\f[]: Video title
+.IP \[bu] 2
+\f[C]url\f[]: Video URL
+.IP \[bu] 2
+\f[C]ext\f[]: Video filename extension
+.IP \[bu] 2
+\f[C]alt_title\f[]: A secondary title of the video
+.IP \[bu] 2
+\f[C]display_id\f[]: An alternative identifier for the video
+.IP \[bu] 2
+\f[C]uploader\f[]: Full name of the video uploader
+.IP \[bu] 2
+\f[C]license\f[]: License name the video is licensed under
+.IP \[bu] 2
+\f[C]creator\f[]: The creator of the video
+.IP \[bu] 2
+\f[C]release_date\f[]: The date (YYYYMMDD) when the video was released
+.IP \[bu] 2
+\f[C]timestamp\f[]: UNIX timestamp of the moment the video became
+available
+.IP \[bu] 2
+\f[C]upload_date\f[]: Video upload date (YYYYMMDD)
+.IP \[bu] 2
+\f[C]uploader_id\f[]: Nickname or id of the video uploader
+.IP \[bu] 2
+\f[C]location\f[]: Physical location where the video was filmed
+.IP \[bu] 2
+\f[C]duration\f[]: Length of the video in seconds
+.IP \[bu] 2
+\f[C]view_count\f[]: How many users have watched the video on the
+platform
+.IP \[bu] 2
+\f[C]like_count\f[]: Number of positive ratings of the video
+.IP \[bu] 2
+\f[C]dislike_count\f[]: Number of negative ratings of the video
+.IP \[bu] 2
+\f[C]repost_count\f[]: Number of reposts of the video
+.IP \[bu] 2
+\f[C]average_rating\f[]: Average rating given by users, the scale used
+depends on the webpage
+.IP \[bu] 2
+\f[C]comment_count\f[]: Number of comments on the video
+.IP \[bu] 2
+\f[C]age_limit\f[]: Age restriction for the video (years)
+.IP \[bu] 2
+\f[C]format\f[]: A human\-readable description of the format
+.IP \[bu] 2
+\f[C]format_id\f[]: Format code specified by \f[C]\-\-format\f[]
+.IP \[bu] 2
+\f[C]format_note\f[]: Additional info about the format
+.IP \[bu] 2
+\f[C]width\f[]: Width of the video
+.IP \[bu] 2
+\f[C]height\f[]: Height of the video
+.IP \[bu] 2
+\f[C]resolution\f[]: Textual description of width and height
+.IP \[bu] 2
+\f[C]tbr\f[]: Average bitrate of audio and video in KBit/s
+.IP \[bu] 2
+\f[C]abr\f[]: Average audio bitrate in KBit/s
+.IP \[bu] 2
+\f[C]acodec\f[]: Name of the audio codec in use
+.IP \[bu] 2
+\f[C]asr\f[]: Audio sampling rate in Hertz
+.IP \[bu] 2
+\f[C]vbr\f[]: Average video bitrate in KBit/s
+.IP \[bu] 2
+\f[C]fps\f[]: Frame rate
+.IP \[bu] 2
+\f[C]vcodec\f[]: Name of the video codec in use
+.IP \[bu] 2
+\f[C]container\f[]: Name of the container format
+.IP \[bu] 2
+\f[C]filesize\f[]: The number of bytes, if known in advance
+.IP \[bu] 2
+\f[C]filesize_approx\f[]: An estimate for the number of bytes
+.IP \[bu] 2
+\f[C]protocol\f[]: The protocol that will be used for the actual
+download
+.IP \[bu] 2
+\f[C]extractor\f[]: Name of the extractor
+.IP \[bu] 2
+\f[C]extractor_key\f[]: Key name of the extractor
+.IP \[bu] 2
+\f[C]epoch\f[]: Unix epoch when creating the file
+.IP \[bu] 2
+\f[C]autonumber\f[]: Five\-digit number that will be increased with each
+download, starting at zero
+.IP \[bu] 2
+\f[C]playlist\f[]: Name or id of the playlist that contains the video
+.IP \[bu] 2
+\f[C]playlist_index\f[]: Index of the video in the playlist padded with
+leading zeros according to the total length of the playlist
+.IP \[bu] 2
+\f[C]playlist_id\f[]: Playlist identifier
+.IP \[bu] 2
+\f[C]playlist_title\f[]: Playlist title
+.PP
+Available for the video that belongs to some logical chapter or section:
+.IP \[bu] 2
+\f[C]chapter\f[]: Name or title of the chapter the video belongs to
+.IP \[bu] 2
+\f[C]chapter_number\f[]: Number of the chapter the video belongs to
+.IP \[bu] 2
+\f[C]chapter_id\f[]: Id of the chapter the video belongs to
+.PP
+Available for the video that is an episode of some series or programme:
+.IP \[bu] 2
+\f[C]series\f[]: Title of the series or programme the video episode
+belongs to
+.IP \[bu] 2
+\f[C]season\f[]: Title of the season the video episode belongs to
+.IP \[bu] 2
+\f[C]season_number\f[]: Number of the season the video episode belongs
+to
+.IP \[bu] 2
+\f[C]season_id\f[]: Id of the season the video episode belongs to
+.IP \[bu] 2
+\f[C]episode\f[]: Title of the video episode
+.IP \[bu] 2
+\f[C]episode_number\f[]: Number of the video episode within a season
+.IP \[bu] 2
+\f[C]episode_id\f[]: Id of the video episode
+.PP
+Available for the media that is a track or a part of a music album:
+.IP \[bu] 2
+\f[C]track\f[]: Title of the track
+.IP \[bu] 2
+\f[C]track_number\f[]: Number of the track within an album or a disc
+.IP \[bu] 2
+\f[C]track_id\f[]: Id of the track
+.IP \[bu] 2
+\f[C]artist\f[]: Artist(s) of the track
+.IP \[bu] 2
+\f[C]genre\f[]: Genre(s) of the track
+.IP \[bu] 2
+\f[C]album\f[]: Title of the album the track belongs to
+.IP \[bu] 2
+\f[C]album_type\f[]: Type of the album
+.IP \[bu] 2
+\f[C]album_artist\f[]: List of all artists who appeared on the album
+.IP \[bu] 2
+\f[C]disc_number\f[]: Number of the disc or other physical medium the
+track belongs to
+.IP \[bu] 2
+\f[C]release_year\f[]: Year (YYYY) when the album was released
+.PP
+Each aforementioned sequence when referenced in an output template will
+be replaced by the actual value corresponding to the sequence name.
+Note that some of the sequences are not guaranteed to be present since
+they depend on the metadata obtained by a particular extractor.
+Such sequences will be replaced with \f[C]NA\f[].
+.PP
+For example for \f[C]\-o\ %(title)s\-%(id)s.%(ext)s\f[] and an mp4 video
+with title \f[C]youtube\-dl\ test\ video\f[] and id
+\f[C]BaW_jenozKcj\f[], this will result in a
+\f[C]youtube\-dl\ test\ video\-BaW_jenozKcj.mp4\f[] file created in the
+current directory.
+.PP
+Output templates can also contain an arbitrary hierarchical path, e.g.
+\f[C]\-o\ \[aq]%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\f[]
+which will result in downloading each video in a directory corresponding
+to this path template.
+Any missing directory will be automatically created for you.
+.PP
+To use percent literals in an output template use \f[C]%%\f[].
+To output to stdout use \f[C]\-o\ \-\f[].
+.PP
+The current default template is \f[C]%(title)s\-%(id)s.%(ext)s\f[].
+.PP
+In some cases, you don\[aq]t want special characters such as 中, spaces,
+or &, such as when transferring the downloaded file to a Windows
+system or passing the filename through an 8bit\-unsafe channel.
+In these cases, add the \f[C]\-\-restrict\-filenames\f[] flag to get a
+shorter title.
+.SS Output template and Windows batch files
+.PP
+If you are using an output template inside a Windows batch file then you
+must escape plain percent characters (\f[C]%\f[]) by doubling, so that
+\f[C]\-o\ "%(title)s\-%(id)s.%(ext)s"\f[] should become
+\f[C]\-o\ "%%(title)s\-%%(id)s.%%(ext)s"\f[].
+However you should not touch \f[C]%\f[]\[aq]s that are not plain
+characters, e.g.
+environment variables for expansion should stay intact:
+\f[C]\-o\ "C:\\%HOMEPATH%\\Desktop\\%%(title)s.%%(ext)s"\f[].
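+.PP
+For example, a minimal batch file combining both (the URL is
+illustrative):
+.IP
+.nf
+\f[C]
+\@echo\ off
+rem\ Doubled\ %%\ for\ template\ fields,\ single\ %\ for\ environment\ variables
+youtube\-dl\ \-o\ "C:\\%HOMEPATH%\\Desktop\\%%(title)s.%%(ext)s"\ https://www.youtube.com/watch?v=BaW_jenozKc
+\f[]
+.fi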
+.SS Output template examples
+.PP
+Note on Windows you may need to use double quotes instead of single.
+.IP
+.nf
+\f[C]
+$\ youtube\-dl\ \-\-get\-filename\ \-o\ \[aq]%(title)s.%(ext)s\[aq]\ BaW_jenozKc
+youtube\-dl\ test\ video\ \[aq]\[aq]_ä↭𝕐.mp4\ \ \ \ #\ All\ kinds\ of\ weird\ characters
+
+$\ youtube\-dl\ \-\-get\-filename\ \-o\ \[aq]%(title)s.%(ext)s\[aq]\ BaW_jenozKc\ \-\-restrict\-filenames
+youtube\-dl_test_video_.mp4\ \ \ \ \ \ \ \ \ \ #\ A\ simple\ file\ name
+
+#\ Download\ YouTube\ playlist\ videos\ in\ separate\ directory\ indexed\ by\ video\ order\ in\ a\ playlist
+$\ youtube\-dl\ \-o\ \[aq]%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\ https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
+
+#\ Download\ all\ playlists\ of\ YouTube\ channel/user\ keeping\ each\ playlist\ in\ separate\ directory:
+$\ youtube\-dl\ \-o\ \[aq]%(uploader)s/%(playlist)s/%(playlist_index)s\ \-\ %(title)s.%(ext)s\[aq]\ https://www.youtube.com/user/TheLinuxFoundation/playlists
+
+#\ Download\ Udemy\ course\ keeping\ each\ chapter\ in\ separate\ directory\ under\ MyVideos\ directory\ in\ your\ home
+$\ youtube\-dl\ \-u\ user\ \-p\ password\ \-o\ \[aq]~/MyVideos/%(playlist)s/%(chapter_number)s\ \-\ %(chapter)s/%(title)s.%(ext)s\[aq]\ https://www.udemy.com/java\-tutorial/
+
+#\ Download\ entire\ series\ season\ keeping\ each\ series\ and\ each\ season\ in\ separate\ directory\ under\ C:/MyVideos
+$\ youtube\-dl\ \-o\ "C:/MyVideos/%(series)s/%(season_number)s\ \-\ %(season)s/%(episode_number)s\ \-\ %(episode)s.%(ext)s"\ http://videomore.ru/kino_v_detalayah/5_sezon/367617
+
+#\ Stream\ the\ video\ being\ downloaded\ to\ stdout
+$\ youtube\-dl\ \-o\ \-\ BaW_jenozKc
+\f[]
+.fi
+.SH FORMAT SELECTION
+.PP
+By default youtube\-dl tries to download the best available quality,
+i.e.
+if you want the best quality you \f[B]don\[aq]t need\f[] to pass any
+special options; youtube\-dl will pick it for you.
+.PP
+But sometimes you may want to download in a different format, for
+example when you are on a slow or intermittent connection.
+The key mechanism for achieving this is the so\-called \f[I]format
+selection\f[], with which you can explicitly specify the desired
+format, select formats based on one or more criteria, set up
+precedence and much more.
+.PP
+The general syntax for format selection is \f[C]\-\-format\ FORMAT\f[]
+or shorter \f[C]\-f\ FORMAT\f[] where \f[C]FORMAT\f[] is a \f[I]selector
+expression\f[], i.e.
+an expression that describes format or formats you would like to
+download.
+.PP
+\f[B]tl;dr:\f[] navigate me to examples (#format-selection-examples).
+.PP
+The simplest case is requesting a specific format, for example with
+\f[C]\-f\ 22\f[] you can download the format with format code equal to
+22.
+You can get the list of available format codes for particular video
+using \f[C]\-\-list\-formats\f[] or \f[C]\-F\f[].
+Note that these format codes are extractor specific.
+.PP
+You can also use a file extension (currently \f[C]3gp\f[], \f[C]aac\f[],
+\f[C]flv\f[], \f[C]m4a\f[], \f[C]mp3\f[], \f[C]mp4\f[], \f[C]ogg\f[],
+\f[C]wav\f[], \f[C]webm\f[] are supported) to download the best quality
+format of a particular file extension served as a single file, e.g.
+\f[C]\-f\ webm\f[] will download the best quality format with the
+\f[C]webm\f[] extension served as a single file.
+.PP
+You can also use special names to select particular edge case formats:
+.IP \[bu] 2
+\f[C]best\f[]: Select the best quality format represented by a single
+file with video and audio.
+.IP \[bu] 2
+\f[C]worst\f[]: Select the worst quality format represented by a
+single file with video and audio.
+.IP \[bu] 2
+\f[C]bestvideo\f[]: Select the best quality video\-only format (e.g.
+DASH video).
+May not be available.
+.IP \[bu] 2
+\f[C]worstvideo\f[]: Select the worst quality video\-only format.
+May not be available.
+.IP \[bu] 2
+\f[C]bestaudio\f[]: Select the best quality audio\-only format.
+May not be available.
+.IP \[bu] 2
+\f[C]worstaudio\f[]: Select the worst quality audio\-only format.
+May not be available.
+.PP
+For example, to download the worst quality video\-only format you can
+use \f[C]\-f\ worstvideo\f[].
+.PP
+If you want to download multiple videos and they don\[aq]t have the same
+formats available, you can specify the order of preference using
+slashes.
+Note that slash is left\-associative, i.e.
+formats on the left hand side are preferred, for example
+\f[C]\-f\ 22/17/18\f[] will download format 22 if it\[aq]s available,
+otherwise it will download format 17 if it\[aq]s available, otherwise it
+will download format 18 if it\[aq]s available, otherwise it will
+complain that no suitable formats are available for download.
+.PP
+If you want to download several formats of the same video use a comma as
+a separator, e.g.
+\f[C]\-f\ 22,17,18\f[] will download all three of these formats,
+provided they are available.
+Or a more sophisticated example combined with the precedence feature:
+\f[C]\-f\ 136/137/mp4/bestvideo,140/m4a/bestaudio\f[].
+.PP
+You can also filter the video formats by putting a condition in
+brackets, as in \f[C]\-f\ "best[height=720]"\f[] (or
+\f[C]\-f\ "[filesize>10M]"\f[]).
+.PP
+The following numeric meta fields can be used with comparisons
+\f[C]<\f[], \f[C]<=\f[], \f[C]>\f[], \f[C]>=\f[], \f[C]=\f[] (equals),
+\f[C]!=\f[] (not equals):
+.IP \[bu] 2
+\f[C]filesize\f[]: The number of bytes, if known in advance
+.IP \[bu] 2
+\f[C]width\f[]: Width of the video, if known
+.IP \[bu] 2
+\f[C]height\f[]: Height of the video, if known
+.IP \[bu] 2
+\f[C]tbr\f[]: Average bitrate of audio and video in KBit/s
+.IP \[bu] 2
+\f[C]abr\f[]: Average audio bitrate in KBit/s
+.IP \[bu] 2
+\f[C]vbr\f[]: Average video bitrate in KBit/s
+.IP \[bu] 2
+\f[C]asr\f[]: Audio sampling rate in Hertz
+.IP \[bu] 2
+\f[C]fps\f[]: Frame rate
+.PP
+Filtering also works with the comparisons \f[C]=\f[] (equals),
+\f[C]!=\f[] (not equals), \f[C]^=\f[] (begins with), \f[C]$=\f[] (ends
+with), \f[C]*=\f[] (contains) and the following string meta fields:
+.IP \[bu] 2
+\f[C]ext\f[]: File extension
+.IP \[bu] 2
+\f[C]acodec\f[]: Name of the audio codec in use
+.IP \[bu] 2
+\f[C]vcodec\f[]: Name of the video codec in use
+.IP \[bu] 2
+\f[C]container\f[]: Name of the container format
+.IP \[bu] 2
+\f[C]protocol\f[]: The protocol that will be used for the actual
+download, lower\-case (\f[C]http\f[], \f[C]https\f[], \f[C]rtsp\f[],
+\f[C]rtmp\f[], \f[C]rtmpe\f[], \f[C]mms\f[], \f[C]f4m\f[],
+\f[C]ism\f[], \f[C]m3u8\f[], or \f[C]m3u8_native\f[])
+.IP \[bu] 2
+\f[C]format_id\f[]: A short description of the format
+.PP
+Note that none of the aforementioned meta fields are guaranteed to be
+present since this solely depends on the metadata obtained by a
+particular extractor, i.e.
+the metadata offered by the video hoster.
+.PP
+Formats for which the value is not known are excluded unless you put a
+question mark (\f[C]?\f[]) after the operator.
+You can combine format filters, so
+\f[C]\-f\ "[height\ <=?\ 720][tbr>500]"\f[] selects up to 720p videos
+(or videos where the height is not known) with a bitrate of at least 500
+KBit/s.
+.PP
+You can merge the video and audio of two formats into a single file
+using \f[C]\-f\ <video\-format>+<audio\-format>\f[] (requires ffmpeg or
+avconv installed), for example \f[C]\-f\ bestvideo+bestaudio\f[] will
+download the best video\-only format, the best audio\-only format and
+mux them together with ffmpeg/avconv.
+.PP
+Format selectors can also be grouped using parentheses, for example if
+you want to download the best mp4 and webm formats with a height lower
+than 480 you can use \f[C]\-f\ \[aq](mp4,webm)[height<480]\[aq]\f[].
+.PP
+Since the end of April 2015 and version 2015.04.26, youtube\-dl uses
+\f[C]\-f\ bestvideo+bestaudio/best\f[] as the default format selection
+(see #5447 (https://github.com/rg3/youtube-dl/issues/5447),
+#5456 (https://github.com/rg3/youtube-dl/issues/5456)).
+If ffmpeg or avconv are installed this results in downloading
+\f[C]bestvideo\f[] and \f[C]bestaudio\f[] separately and muxing them
+together into a single file giving the best overall quality available.
+Otherwise it falls back to \f[C]best\f[] and results in downloading the
+best available quality served as a single file.
+\f[C]best\f[] is also needed for videos that don\[aq]t come from YouTube
+because they don\[aq]t provide the audio and video in two different
+files.
+If you want to only download some DASH formats (for example if you are
+not interested in getting videos with a resolution higher than 1080p),
+you can add \f[C]\-f\ bestvideo[height<=?1080]+bestaudio/best\f[] to
+your configuration file.
+Note that if you use youtube\-dl to stream to \f[C]stdout\f[] (and most
+likely to pipe it to your media player then), i.e.
+you explicitly specify output template as \f[C]\-o\ \-\f[], youtube\-dl
+still uses \f[C]\-f\ best\f[] format selection in order to start content
+delivery immediately to your player and not to wait until
+\f[C]bestvideo\f[] and \f[C]bestaudio\f[] are downloaded and muxed.
+.PP
+If you want to preserve the old format selection behavior (prior to
+youtube\-dl 2015.04.26), i.e.
+you want to download the best available quality media served as a single
+file, you should explicitly specify your choice with \f[C]\-f\ best\f[].
+You may want to add it to the configuration file (#configuration) in
+order not to type it every time you run youtube\-dl.
+.SS Format selection examples
+.PP
+Note on Windows you may need to use double quotes instead of single.
+.IP
+.nf
+\f[C]
+#\ Download\ best\ mp4\ format\ available\ or\ any\ other\ best\ if\ no\ mp4\ available
+$\ youtube\-dl\ \-f\ \[aq]bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best\[aq]
+
+#\ Download\ best\ format\ available\ but\ not\ better\ than\ 480p
+$\ youtube\-dl\ \-f\ \[aq]bestvideo[height<=480]+bestaudio/best[height<=480]\[aq]
+
+#\ Download\ best\ format\ available\ but\ no\ bigger\ than\ 50\ MB
+$\ youtube\-dl\ \-f\ \[aq]best[filesize<50M]\[aq]
+
+#\ Download\ best\ format\ available\ via\ direct\ link\ over\ HTTP/HTTPS\ protocol
+$\ youtube\-dl\ \-f\ \[aq](bestvideo+bestaudio/best)[protocol^=http]\[aq]
+
+#\ Download\ the\ best\ video\ format\ and\ the\ best\ audio\ format\ without\ merging\ them
+$\ youtube\-dl\ \-f\ \[aq]bestvideo,bestaudio\[aq]\ \-o\ \[aq]%(title)s.f%(format_id)s.%(ext)s\[aq]
+\f[]
+.fi
+.PP
+Note that in the last example, an output template is recommended as
+bestvideo and bestaudio may have the same file name.
+.SH VIDEO SELECTION
+.PP
+Videos can be filtered by their upload date using the options
+\f[C]\-\-date\f[], \f[C]\-\-datebefore\f[] or \f[C]\-\-dateafter\f[].
+They accept dates in two formats:
+.IP \[bu] 2
+Absolute dates: Dates in the format \f[C]YYYYMMDD\f[].
+.IP \[bu] 2
+Relative dates: Dates in the format
+\f[C](now|today)[+\-][0\-9](day|week|month|year)(s)?\f[]
+.PP
+Examples:
+.IP
+.nf
+\f[C]
+#\ Download\ only\ the\ videos\ uploaded\ in\ the\ last\ 6\ months
+$\ youtube\-dl\ \-\-dateafter\ now\-6months
+
+#\ Download\ only\ the\ videos\ uploaded\ on\ January\ 1,\ 1970
+$\ youtube\-dl\ \-\-date\ 19700101
+
+$\ #\ Download\ only\ the\ videos\ uploaded\ in\ the\ 200x\ decade
+$\ youtube\-dl\ \-\-dateafter\ 20000101\ \-\-datebefore\ 20091231
+\f[]
+.fi
+.SH FAQ
+.SS How do I update youtube\-dl?
+.PP
+If you\[aq]ve followed our manual installation
+instructions (http://rg3.github.io/youtube-dl/download.html), you can
+simply run \f[C]youtube\-dl\ \-U\f[] (or, on Linux,
+\f[C]sudo\ youtube\-dl\ \-U\f[]).
+.PP
+If you have used pip, a simple
+\f[C]sudo\ pip\ install\ \-U\ youtube\-dl\f[] is sufficient to update.
+.PP
+If you have installed youtube\-dl using a package manager like
+\f[I]apt\-get\f[] or \f[I]yum\f[], use the standard system update
+mechanism to update.
+Note that distribution packages are often outdated.
+As a rule of thumb, youtube\-dl releases at least once a month, and
+often weekly or even daily.
+Simply go to http://yt\-dl.org/ to find out the current version.
+Unfortunately, there is nothing we youtube\-dl developers can do if your
+distribution serves a really outdated version.
+You can (and should) complain to your distribution in their bugtracker
+or support forum.
+.PP
+As a last resort, you can also uninstall the version installed by your
+package manager and follow our manual installation instructions.
+For that, remove the distribution\[aq]s package, with a line like
+.IP
+.nf
+\f[C]
+sudo\ apt\-get\ remove\ \-y\ youtube\-dl
+\f[]
+.fi
+.PP
+Afterwards, simply follow our manual installation
+instructions (http://rg3.github.io/youtube-dl/download.html):
+.IP
+.nf
+\f[C]
+sudo\ wget\ https://yt\-dl.org/latest/youtube\-dl\ \-O\ /usr/local/bin/youtube\-dl
+sudo\ chmod\ a+x\ /usr/local/bin/youtube\-dl
+hash\ \-r
+\f[]
+.fi
+.PP
+Again, from then on you\[aq]ll be able to update with
+\f[C]sudo\ youtube\-dl\ \-U\f[].
+.SS youtube\-dl is extremely slow to start on Windows
+.PP
+Add a file exclusion for \f[C]youtube\-dl.exe\f[] in Windows Defender
+settings.
+.SS I\[aq]m getting an error
+\f[C]Unable\ to\ extract\ OpenGraph\ title\f[] on YouTube playlists
+.PP
+YouTube changed their playlist format in March 2014 and later on, so
+you\[aq]ll need at least youtube\-dl 2014.07.25 to download all YouTube
+videos.
+.PP
+If you have installed youtube\-dl with a package manager, pip, setup.py
+or a tarball, please use that to update.
+Note that Ubuntu packages do not seem to get updated anymore.
+Since we are not affiliated with Ubuntu, there is little we can do.
+Feel free to report
+bugs (https://bugs.launchpad.net/ubuntu/+source/youtube-dl/+filebug) to
+the Ubuntu packaging
+people (mailto:ubuntu-motu@lists.ubuntu.com?subject=outdated%20version%20of%20youtube-dl)
+\- all they have to do is update the package to a somewhat recent
+version.
+See above for a way to update.
+.SS I\[aq]m getting an error when trying to use output template:
+\f[C]error:\ using\ output\ template\ conflicts\ with\ using\ title,\ video\ ID\ or\ auto\ number\f[]
+.PP
+Make sure you are not using \f[C]\-o\f[] with any of these options
+\f[C]\-t\f[], \f[C]\-\-title\f[], \f[C]\-\-id\f[], \f[C]\-A\f[] or
+\f[C]\-\-auto\-number\f[] set in command line or in a configuration
+file.
+Remove those options if any are present.
+.SS Do I always have to pass \f[C]\-citw\f[]?
+.PP
+By default, youtube\-dl intends to have the best options (incidentally,
+if you have a convincing case that these should be different, please
+file an issue where you explain that (https://yt-dl.org/bug)).
+Therefore, it is unnecessary and sometimes harmful to copy long option
+strings from webpages.
+In particular, the only option out of \f[C]\-citw\f[] that is regularly
+useful is \f[C]\-i\f[].
+.SS Can you please put the \f[C]\-b\f[] option back?
+.PP
+Most people asking this question are not aware that youtube\-dl now
+defaults to downloading the highest available quality as reported by
+YouTube, which will be 1080p or 720p in some cases, so you no longer
+need the \f[C]\-b\f[] option.
+For some specific videos, maybe YouTube does not report them to be
+available in a specific high quality format you\[aq]re interested in.
+In that case, simply request it with the \f[C]\-f\f[] option and
+youtube\-dl will try to download it.
+.SS I get HTTP error 402 when trying to download a video. What\[aq]s
+this?
+.PP
+Apparently YouTube requires you to pass a CAPTCHA test if you download
+too much.
+We\[aq]re considering providing a way to let you solve the
+CAPTCHA (https://github.com/rg3/youtube-dl/issues/154), but at the
+moment, your best course of action is pointing a web browser to the
+youtube URL, solving the CAPTCHA, and restarting youtube\-dl.
+.SS Do I need any other programs?
+.PP
+youtube\-dl works fine on its own on most sites.
+However, if you want to convert video/audio, you\[aq]ll need
+avconv (https://libav.org/) or ffmpeg (https://www.ffmpeg.org/).
+On some sites \- most notably YouTube \- videos can be retrieved in a
+higher quality format without sound.
+youtube\-dl will detect whether avconv/ffmpeg is present and
+automatically pick the best option.
+.PP
+Videos or video formats streamed via RTMP protocol can only be
+downloaded when rtmpdump (https://rtmpdump.mplayerhq.hu/) is installed.
+Downloading MMS and RTSP videos requires either
+mplayer (http://mplayerhq.hu/) or mpv (https://mpv.io/) to be installed.
+.SS I have downloaded a video but how can I play it?
+.PP
+Once the video is fully downloaded, use any video player, such as
+mpv (https://mpv.io/), vlc (http://www.videolan.org/) or
+mplayer (http://www.mplayerhq.hu/).
+.SS I extracted a video URL with \f[C]\-g\f[], but it does not play on
+another machine / in my web browser.
+.PP
+It depends a lot on the service.
+In many cases, requests for the video (to download/play it) must come
+from the same IP address and with the same cookies and/or HTTP headers.
+Use the \f[C]\-\-cookies\f[] option to write the required cookies into a
+file, and advise your downloader to read cookies from that file.
+Some sites also require a common user agent to be used; use
+\f[C]\-\-dump\-user\-agent\f[] to see the one in use by youtube\-dl.
+You can also get necessary cookies and HTTP headers from JSON output
+obtained with \f[C]\-\-dump\-json\f[].
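+.PP
+For example (wget is just one downloader that understands this cookies
+file format; the URL is illustrative):
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-\-cookies\ cookies.txt\ \-g\ \[aq]https://some/video\[aq]
+wget\ \-\-load\-cookies\ cookies.txt\ \[aq]<URL\ printed\ above>\[aq]
+\f[]
+.fi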
+.PP
+It may be beneficial to use IPv6; in some cases, the restrictions are
+only applied to IPv4.
+Some services (sometimes only for a subset of videos) do not restrict
+the video URL by IP address, cookie, or user\-agent, but these are the
+exception rather than the rule.
+.PP
+Please bear in mind that some URL protocols are \f[B]not\f[] supported
+by browsers out of the box, including RTMP.
+If you are using \f[C]\-g\f[], your own downloader must support these as
+well.
+.PP
+If you want to play the video on a machine that is not running
+youtube\-dl, you can relay the video content from the machine that runs
+youtube\-dl.
+You can use \f[C]\-o\ \-\f[] to let youtube\-dl stream a video to
+stdout, or simply allow the player to download the files written by
+youtube\-dl in turn.
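+.PP
+For example, one way to relay the stream straight into a player (mpv is
+illustrative; any player that reads from stdin works):
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-o\ \-\ BaW_jenozKc\ |\ mpv\ \-
+\f[]
+.fi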
+.SS ERROR: no fmt_url_map or conn information found in video info
+.PP
+YouTube has switched to a new video info format in July 2011 which is
+not supported by old versions of youtube\-dl.
+See above (#how-do-i-update-youtube-dl) for how to update youtube\-dl.
+.SS ERROR: unable to download video
+.PP
+YouTube requires an additional signature since September 2012 which is
+not supported by old versions of youtube\-dl.
+See above (#how-do-i-update-youtube-dl) for how to update youtube\-dl.
+.SS Video URL contains an ampersand and I\[aq]m getting some strange
+output \f[C][1]\ 2839\f[] or
+\f[C]\[aq]v\[aq]\ is\ not\ recognized\ as\ an\ internal\ or\ external\ command\f[]
+.PP
+That\[aq]s actually the output from your shell.
+Since the ampersand is one of the special shell characters, it\[aq]s
+interpreted by the shell, preventing you from passing the whole URL to
+youtube\-dl.
+To disable your shell from interpreting the ampersands (or any other
+special characters) you have to either put the whole URL in quotes or
+escape them with a backslash (which approach will work depends on your
+shell).
+.PP
+For example if your URL is
+https://www.youtube.com/watch?t=4&v=BaW_jenozKc you should end up with
+the following command:
+.PP
+\f[C]youtube\-dl\ \[aq]https://www.youtube.com/watch?t=4&v=BaW_jenozKc\[aq]\f[]
+.PP
+or
+.PP
+\f[C]youtube\-dl\ https://www.youtube.com/watch?t=4\\&v=BaW_jenozKc\f[]
+.PP
+For Windows you have to use the double quotes:
+.PP
+\f[C]youtube\-dl\ "https://www.youtube.com/watch?t=4&v=BaW_jenozKc"\f[]
+.SS ExtractorError: Could not find JS function u\[aq]OF\[aq]
+.PP
+In February 2015, the new YouTube player contained a character sequence
+in a string that was misinterpreted by old versions of youtube\-dl.
+See above (#how-do-i-update-youtube-dl) for how to update youtube\-dl.
+.SS HTTP Error 429: Too Many Requests or 402: Payment Required
+.PP
+These two error codes indicate that the service is blocking your IP
+address because of overuse.
+Contact the service and ask them to unblock your IP address, or \- if
+you have acquired a whitelisted IP address already \- use the
+\f[C]\-\-proxy\f[] or \f[C]\-\-source\-address\f[]
+options (#network-options) to select another IP address.
+.SS SyntaxError: Non\-ASCII character
+.PP
+The error
+.IP
+.nf
+\f[C]
+File\ "youtube\-dl",\ line\ 2
+SyntaxError:\ Non\-ASCII\ character\ \[aq]\\x93\[aq]\ ...
+\f[]
+.fi
+.PP
+means you\[aq]re using an outdated version of Python.
+Please update to Python 2.6 or 2.7.
+.SS What is this binary file? Where has the code gone?
+.PP
+Since June 2012 (#342 (https://github.com/rg3/youtube-dl/issues/342))
+youtube\-dl is packed as an executable zipfile; simply unzip it (it
+might need renaming to \f[C]youtube\-dl.zip\f[] first on some systems)
+or clone the git repository, as laid out above.
+If you modify the code, you can run it by executing the
+\f[C]__main__.py\f[] file.
+To recompile the executable, run \f[C]make\ youtube\-dl\f[].
+.SS The exe throws an error due to missing \f[C]MSVCR100.dll\f[]
+.PP
+To run the exe you first need to install the Microsoft Visual C++ 2010
+Redistributable Package
+(x86) (https://www.microsoft.com/en-US/download/details.aspx?id=5555).
+.SS On Windows, how should I set up ffmpeg and youtube\-dl? Where should
+I put the exe files?
+.PP
+If you put youtube\-dl and ffmpeg in the same directory that you\[aq]re
+running the command from, it will work, but that\[aq]s rather
+cumbersome.
+.PP
+To make a different directory work \- either for ffmpeg, or for
+youtube\-dl, or for both \- simply create the directory (say,
+\f[C]C:\\bin\f[], or \f[C]C:\\Users\\<User\ name>\\bin\f[]), put all the
+executables directly in there, and then set your PATH environment
+variable (https://www.java.com/en/download/help/path.xml) to include
+that directory.
+.PP
+From then on, after restarting your shell, you will be able to access
+both youtube\-dl and ffmpeg (and youtube\-dl will be able to find
+ffmpeg) by simply typing \f[C]youtube\-dl\f[] or \f[C]ffmpeg\f[], no
+matter what directory you\[aq]re in.
+.SS How do I put downloads into a specific folder?
+.PP
+Use the \f[C]\-o\f[] to specify an output template (#output-template),
+for example \f[C]\-o\ "/home/user/videos/%(title)s\-%(id)s.%(ext)s"\f[].
+If you want this for all of your downloads, put the option into your
+configuration file (#configuration).
+.SS How do I download a video starting with a \f[C]\-\f[]?
+.PP
+Either prepend \f[C]http://www.youtube.com/watch?v=\f[] or separate the
+ID from the options with \f[C]\-\-\f[]:
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-\-\ \-wNyEUrxzFU
+youtube\-dl\ "http://www.youtube.com/watch?v=\-wNyEUrxzFU"
+\f[]
+.fi
+.SS How do I pass cookies to youtube\-dl?
+.PP
+Use the \f[C]\-\-cookies\f[] option, for example
+\f[C]\-\-cookies\ /path/to/cookies/file.txt\f[].
+.PP
+In order to extract cookies from your browser, use any conforming
+browser extension for exporting cookies.
+For example,
+cookies.txt (https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg)
+(for Chrome) or Export
+Cookies (https://addons.mozilla.org/en-US/firefox/addon/export-cookies/)
+(for Firefox).
+.PP
+Note that the cookies file must be in Mozilla/Netscape format and the
+first line of the cookies file must be either
+\f[C]#\ HTTP\ Cookie\ File\f[] or
+\f[C]#\ Netscape\ HTTP\ Cookie\ File\f[].
+Make sure you have correct newline
+format (https://en.wikipedia.org/wiki/Newline) in the cookies file and
+convert newlines if necessary to correspond with your OS, namely
+\f[C]CRLF\f[] (\f[C]\\r\\n\f[]) for Windows and \f[C]LF\f[]
+(\f[C]\\n\f[]) for Unix and Unix\-like systems (Linux, Mac OS, etc.).
+\f[C]HTTP\ Error\ 400:\ Bad\ Request\f[] when using \f[C]\-\-cookies\f[]
+is a good sign of invalid newline format.
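The newline conversion can be done with a few lines of stock Python; a minimal sketch, assuming the cookies file is small enough to read into memory:

```python
# Normalize a Netscape-format cookies file to the newline convention of
# the current OS (CRLF on Windows, LF on Unix and Unix-like systems).
import os

def normalize_cookie_newlines(path):
    # Read raw bytes so existing CRLF/LF sequences stay visible.
    with open(path, 'rb') as f:
        data = f.read()
    # Collapse everything to LF first, then expand to the OS convention.
    data = data.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    data = data.replace(b'\n', os.linesep.encode('ascii'))
    with open(path, 'wb') as f:
        f.write(data)
```
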
+.PP
+Passing cookies to youtube\-dl is a good way to work around login when
+a particular extractor does not implement it explicitly.
+Another use case is working around
+CAPTCHA (https://en.wikipedia.org/wiki/CAPTCHA) some websites require
+you to solve in particular cases in order to get access (e.g.
+YouTube, CloudFlare).
+.SS How do I stream directly to media player?
+.PP
+You will first need to tell youtube\-dl to stream media to stdout with
+\f[C]\-o\ \-\f[], tell your media player to read from stdin (it must be
+capable of this for streaming), and then pipe the former to the latter.
+For example, streaming to vlc (http://www.videolan.org/) can be achieved
+with:
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-o\ \-\ "http://www.youtube.com/watch?v=BaW_jenozKcj"\ |\ vlc\ \-
+\f[]
+.fi
+.SS How do I download only new videos from a playlist?
+.PP
+Use the download\-archive feature.
+With this feature you should initially download the complete playlist
+with \f[C]\-\-download\-archive\ /path/to/download/archive/file.txt\f[],
+which will record the identifiers of all the videos in a special file.
+Each subsequent run with the same \f[C]\-\-download\-archive\f[] will
+download only new videos and skip all videos that have been downloaded
+before.
+Note that only successful downloads are recorded in the file.
+.PP
+For example, at first,
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-\-download\-archive\ archive.txt\ "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
+\f[]
+.fi
+.PP
+will download the complete \f[C]PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re\f[]
+playlist and create a file \f[C]archive.txt\f[].
+Each subsequent run will only download new videos if any:
+.IP
+.nf
+\f[C]
+youtube\-dl\ \-\-download\-archive\ archive.txt\ "https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re"
+\f[]
+.fi
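The idea behind the archive file is simple enough to sketch. The following illustrates the mechanism only; it is not youtube-dl's actual implementation, and the helper names are made up:

```python
# Sketch of the --download-archive idea: the archive file stores one
# "<extractor> <id>" line per successful download, and entries already
# present are skipped on later runs.

def load_archive(path):
    # Return the set of recorded entries; a missing file means an
    # empty archive.
    try:
        with open(path) as f:
            return set(line.strip() for line in f if line.strip())
    except IOError:
        return set()

def filter_new(archive, entries):
    # entries: iterable of (extractor, video_id) pairs; keep only the
    # ones not yet recorded.
    return [(ex, vid) for ex, vid in entries
            if '%s %s' % (ex, vid) not in archive]

def record(path, extractor, video_id):
    # Append an entry after a successful download.
    with open(path, 'a') as f:
        f.write('%s %s\n' % (extractor, video_id))
```
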
+.SS Should I add \f[C]\-\-hls\-prefer\-native\f[] into my config?
+.PP
+When youtube\-dl detects an HLS video, it can download it either with
+the built\-in downloader or ffmpeg.
+Since many HLS streams are slightly invalid and ffmpeg/youtube\-dl each
+handle some invalid cases better than the other, there is an option to
+switch the downloader if needed.
+.PP
+When youtube\-dl knows that one particular downloader works better for a
+given website, that downloader will be picked.
+Otherwise, youtube\-dl will pick the best downloader for general
+compatibility, which at the moment happens to be ffmpeg.
+This choice may change in future versions of youtube\-dl, with
+improvements of the built\-in downloader and/or ffmpeg.
+.PP
+In particular, the generic extractor (used when your website is not in
+the list of sites supported by
+youtube\-dl (http://rg3.github.io/youtube-dl/supportedsites.html)) cannot
+mandate one specific downloader.
+.PP
+If you put either \f[C]\-\-hls\-prefer\-native\f[] or
+\f[C]\-\-hls\-prefer\-ffmpeg\f[] into your configuration, a different
+subset of videos will fail to download correctly.
+Instead, it is much better to file an issue (https://yt-dl.org/bug) or a
+pull request which details why the native or the ffmpeg HLS downloader
+is a better choice for your use case.
+.SS Can you add support for this anime video site, or site which shows
+current movies for free?
+.PP
+As a matter of policy (as well as legality), youtube\-dl does not
+include support for services that specialize in infringing copyright.
+As a rule of thumb, if you cannot easily find a video that the service
+is quite obviously allowed to distribute (i.e.
+that has been uploaded by the creator, the creator\[aq]s distributor, or
+is published under a free license), the service is probably unfit for
+inclusion in youtube\-dl.
+.PP
+A note on the service stating that it does not host the infringing
+content but merely links to those who do is evidence that the service
+should \f[B]not\f[] be included in youtube\-dl.
+The same goes for any DMCA note when the whole front page of the service
+is filled with videos they are not allowed to distribute.
+A "fair use" note is equally unconvincing if the service shows
+copyright\-protected videos in full without authorization.
+.PP
+Support requests for services that \f[B]do\f[] purchase the rights to
+distribute their content are perfectly fine though.
+If in doubt, you can simply include a source that mentions the
+legitimate purchase of content.
+.SS How can I speed up work on my issue?
+.PP
+(Also known as: Help, my important issue is not being solved!) The
+youtube\-dl core developer team is quite small.
+While we do our best to solve as many issues as possible, sometimes that
+can take quite a while.
+To speed up your issue, here\[aq]s what you can do:
+.PP
+First of all, please do report the issue at our issue
+tracker (https://yt-dl.org/bugs).
+That allows us to coordinate all efforts by users and developers, and
+serves as a unified point of reference.
+Unfortunately, the youtube\-dl project has grown too large to use
+personal email as an effective communication channel.
+.PP
+Please read the bug reporting instructions (#bugs) below.
+A lot of bugs lack all the necessary information.
+If you can, offer proxy, VPN, or shell access to the youtube\-dl
+developers.
+If you are able to, test the issue from multiple computers in multiple
+countries to exclude local censorship or misconfiguration issues.
+.PP
+If nobody is interested in solving your issue, you are welcome to take
+matters into your own hands and submit a pull request (or coerce/pay
+somebody else to do so).
+.PP
+Feel free to bump the issue from time to time by writing a small comment
+("Issue is still present in youtube\-dl version ...from France, but
+fixed from Belgium"), but please not more than once a month.
+Please do not declare your issue as \f[C]important\f[] or
+\f[C]urgent\f[].
+.SS How can I detect whether a given URL is supported by youtube\-dl?
+.PP
+For one, have a look at the list of supported
+sites (docs/supportedsites.md).
+Note that it can sometimes happen that the site changes its URL scheme
+(say, from http://example.com/video/1234567 to
+http://example.com/v/1234567) and youtube\-dl reports a URL of a
+service in that list as unsupported.
+In that case, simply report a bug.
+.PP
+It is \f[I]not\f[] possible to detect whether a URL is supported or not.
+That\[aq]s because youtube\-dl contains a generic extractor which
+matches \f[B]all\f[] URLs.
+You may be tempted to disable, exclude, or remove the generic extractor,
+but the generic extractor not only allows users to extract videos from
+lots of websites that embed a video from another service, but may also
+be used to extract videos that a service hosts itself.
+Therefore, we neither recommend nor support disabling, excluding, or
+removing the generic extractor.
+.PP
+If you want to find out whether a given URL is supported, simply call
+youtube\-dl with it.
+If you get no videos back, chances are the URL is either not referring
+to a video or unsupported.
+You can find out which by examining the output (if you run youtube\-dl
+on the console) or catching an \f[C]UnsupportedError\f[] exception if
+you run it from a Python program.
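A toy model of the extractor list shows why static detection is impossible. The patterns below are invented for illustration, but the catch-all behaves like the real generic extractor:

```python
import re

# Toy model: specific extractors match URL patterns; the generic
# extractor deliberately matches everything, so every URL "is supported"
# as far as pattern matching can tell.
EXTRACTORS = [
    ('youtube', r'https?://(?:www\.)?youtube\.com/watch\?'),
    ('vimeo', r'https?://(?:www\.)?vimeo\.com/\d+'),
    ('generic', r'.*'),  # catch-all, like youtube-dl's generic extractor
]

def matching_extractor(url):
    # Return the name of the first extractor whose pattern matches.
    for name, pattern in EXTRACTORS:
        if re.match(pattern, url):
            return name
```

Because the last pattern matches any string, `matching_extractor` never reports a URL as unsupported; only actually attempting the extraction can tell.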
+.SH Why do I need to go through that much red tape when filing bugs?
+.PP
+Before we had the issue template, despite our extensive bug reporting
+instructions (#bugs), about 80% of the issue reports we got were
+useless, for instance because people used ancient versions hundreds of
+releases old, because of simple syntactic errors (not in youtube\-dl but
+in general shell usage), because the problem was already reported
+multiple times before, because people did not actually read an error
+message, even if it said "please install ffmpeg", because people did not
+mention the URL they were trying to download and many more simple,
+easy\-to\-avoid problems, many of which were totally unrelated to
+youtube\-dl.
+.PP
+youtube\-dl is an open\-source project manned by too few volunteers, so
+we\[aq]d rather spend time fixing bugs where we are certain none of
+those simple problems apply, and where we can be reasonably confident to
+be able to reproduce the issue without asking the reporter repeatedly.
+As such, the output of \f[C]youtube\-dl\ \-v\ YOUR_URL_HERE\f[] is
+really all that\[aq]s required to file an issue.
+The issue template also guides you through some basic steps you can do,
+such as checking that your version of youtube\-dl is current.
+.SH DEVELOPER INSTRUCTIONS
+.PP
+Most users do not need to build youtube\-dl and can download the
+builds (http://rg3.github.io/youtube-dl/download.html) or get them from
+their distribution.
+.PP
+To run youtube\-dl as a developer, you don\[aq]t need to build anything
+either.
+Simply execute
+.IP
+.nf
+\f[C]
+python\ \-m\ youtube_dl
+\f[]
+.fi
+.PP
+To run the test, simply invoke your favorite test runner, or execute a
+test file directly; any of the following work:
+.IP
+.nf
+\f[C]
+python\ \-m\ unittest\ discover
+python\ test/test_download.py
+nosetests
+\f[]
+.fi
+.PP
+If you want to create a build of youtube\-dl yourself, you\[aq]ll need
+.IP \[bu] 2
+python
+.IP \[bu] 2
+make (only GNU make is supported)
+.IP \[bu] 2
+pandoc
+.IP \[bu] 2
+zip
+.IP \[bu] 2
+nosetests
+.SS Adding support for a new site
+.PP
+If you want to add support for a new site, first of all \f[B]make
+sure\f[] this site is \f[B]not dedicated to copyright
+infringement (README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)\f[].
+youtube\-dl does \f[B]not support\f[] such sites thus pull requests
+adding support for them \f[B]will be rejected\f[].
+.PP
+After you have ensured this site is distributing its content legally,
+you can follow this quick list (assuming your service is called
+\f[C]yourextractor\f[]):
+.IP " 1." 4
+Fork this repository (https://github.com/rg3/youtube-dl/fork)
+.IP " 2." 4
+Check out the source code with:
+.RS 4
+.IP
+.nf
+\f[C]
+git\ clone\ git\@github.com:YOUR_GITHUB_USERNAME/youtube\-dl.git
+\f[]
+.fi
+.RE
+.IP " 3." 4
+Start a new git branch with
+.RS 4
+.IP
+.nf
+\f[C]
+cd\ youtube\-dl
+git\ checkout\ \-b\ yourextractor
+\f[]
+.fi
+.RE
+.IP " 4." 4
+Start with this simple template and save it to
+\f[C]youtube_dl/extractor/yourextractor.py\f[]:
+.RS 4
+.IP
+.nf
+\f[C]
+#\ coding:\ utf\-8
+from\ __future__\ import\ unicode_literals
+
+from\ .common\ import\ InfoExtractor
+
+
+class\ YourExtractorIE(InfoExtractor):
+\ \ \ \ _VALID_URL\ =\ r\[aq]https?://(?:www\\.)?yourextractor\\.com/watch/(?P<id>[0\-9]+)\[aq]
+\ \ \ \ _TEST\ =\ {
+\ \ \ \ \ \ \ \ \[aq]url\[aq]:\ \[aq]http://yourextractor.com/watch/42\[aq],
+\ \ \ \ \ \ \ \ \[aq]md5\[aq]:\ \[aq]TODO:\ md5\ sum\ of\ the\ first\ 10241\ bytes\ of\ the\ video\ file\ (use\ \-\-test)\[aq],
+\ \ \ \ \ \ \ \ \[aq]info_dict\[aq]:\ {
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]id\[aq]:\ \[aq]42\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]ext\[aq]:\ \[aq]mp4\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]title\[aq]:\ \[aq]Video\ title\ goes\ here\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]thumbnail\[aq]:\ r\[aq]re:^https?://.*\\.jpg$\[aq],
+\ \ \ \ \ \ \ \ \ \ \ \ #\ TODO\ more\ properties,\ either\ as:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ A\ value
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ MD5\ checksum;\ start\ the\ string\ with\ md5:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ A\ regular\ expression;\ start\ the\ string\ with\ re:
+\ \ \ \ \ \ \ \ \ \ \ \ #\ *\ Any\ Python\ type\ (for\ example\ int\ or\ float)
+\ \ \ \ \ \ \ \ }
+\ \ \ \ }
+
+\ \ \ \ def\ _real_extract(self,\ url):
+\ \ \ \ \ \ \ \ video_id\ =\ self._match_id(url)
+\ \ \ \ \ \ \ \ webpage\ =\ self._download_webpage(url,\ video_id)
+
+\ \ \ \ \ \ \ \ #\ TODO\ more\ code\ goes\ here,\ for\ example\ ...
+\ \ \ \ \ \ \ \ title\ =\ self._html_search_regex(r\[aq]<h1>(.+?)</h1>\[aq],\ webpage,\ \[aq]title\[aq])
+
+\ \ \ \ \ \ \ \ return\ {
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]id\[aq]:\ video_id,
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]title\[aq]:\ title,
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]description\[aq]:\ self._og_search_description(webpage),
+\ \ \ \ \ \ \ \ \ \ \ \ \[aq]uploader\[aq]:\ self._search_regex(r\[aq]<div[^>]+id="uploader"[^>]*>([^<]+)<\[aq],\ webpage,\ \[aq]uploader\[aq],\ fatal=False),
+\ \ \ \ \ \ \ \ \ \ \ \ #\ TODO\ more\ properties\ (see\ youtube_dl/extractor/common.py)
+\ \ \ \ \ \ \ \ }
+\f[]
+.fi
+.RE
+.IP " 5." 4
+Add an import in
+\f[C]youtube_dl/extractor/extractors.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
+.IP " 6." 4
+Run
+\f[C]python\ test/test_download.py\ TestDownload.test_YourExtractor\f[].
+This \f[I]should fail\f[] at first, but you can continually re\-run it
+until you\[aq]re done.
+If you decide to add more than one test, then rename \f[C]_TEST\f[] to
+\f[C]_TESTS\f[] and make it into a list of dictionaries.
+The tests will then be named \f[C]TestDownload.test_YourExtractor\f[],
+\f[C]TestDownload.test_YourExtractor_1\f[],
+\f[C]TestDownload.test_YourExtractor_2\f[], etc.
+.IP " 7." 4
+Have a look at
+\f[C]youtube_dl/extractor/common.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py)
+for possible helper methods and a detailed description of what your
+extractor should and may
+return (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252).
+Add tests and code for as many as you want.
+.IP " 8." 4
+Make sure your code follows youtube\-dl coding
+conventions (#youtube-dl-coding-conventions) and check the code with
+flake8 (https://pypi.python.org/pypi/flake8).
+Also make sure your code works under all Python (http://www.python.org/)
+versions claimed supported by youtube\-dl, namely 2.6, 2.7, and 3.2+.
+.IP " 9." 4
+When the tests pass, add (http://git-scm.com/docs/git-add) the new files
+and commit (http://git-scm.com/docs/git-commit) them and
+push (http://git-scm.com/docs/git-push) the result, like this:
+.RS 4
+.IP
+.nf
+\f[C]
+$\ git\ add\ youtube_dl/extractor/extractors.py
+$\ git\ add\ youtube_dl/extractor/yourextractor.py
+$\ git\ commit\ \-m\ \[aq][yourextractor]\ Add\ new\ extractor\[aq]
+$\ git\ push\ origin\ yourextractor
+\f[]
+.fi
+.RE
+.IP "10." 4
+Finally, create a pull
+request (https://help.github.com/articles/creating-a-pull-request).
+We\[aq]ll then review and merge it.
+.PP
+In any case, thank you very much for your contributions!
+.SS youtube\-dl coding conventions
+.PP
+This section introduces guidelines for writing idiomatic, robust and
+future\-proof extractor code.
+.PP
+Extractors are very fragile by nature since they depend on the layout
+of the source data, which is provided by 3rd\-party media hosters
+outside your control, and this layout tends to change.
+As an extractor implementer, your task is not only to write code that
+will extract media links and metadata correctly, but also to minimize
+dependency on the source\[aq]s layout and even to anticipate potential
+future changes.
+This is important because it will allow the extractor not to break on
+minor layout changes thus keeping old youtube\-dl versions working.
+Even though this breakage issue is easily fixed by releasing a new
+version of youtube\-dl with the fix incorporated, all the previous
+versions remain broken in all repositories and distros\[aq] packages,
+which may not be so prompt in fetching the update from us.
+Needless to say, some non\-rolling\-release distros may never receive
+an update at all.
+.SS Mandatory and optional metafields
+.PP
+For extraction to work youtube\-dl relies on metadata your extractor
+extracts and provides to youtube\-dl expressed by an information
+dictionary (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L75-L257)
+or simply \f[I]info dict\f[].
+Only the following meta fields in the \f[I]info dict\f[] are considered
+mandatory for a successful extraction process by youtube\-dl:
+.IP \[bu] 2
+\f[C]id\f[] (media identifier)
+.IP \[bu] 2
+\f[C]title\f[] (media title)
+.IP \[bu] 2
+\f[C]url\f[] (media download URL) or \f[C]formats\f[]
+.PP
+In fact only the last option is technically mandatory (i.e.
+if you can\[aq]t figure out the download location of the media the
+extraction does not make any sense).
+But by convention youtube\-dl also treats \f[C]id\f[] and \f[C]title\f[]
+as mandatory.
+Thus the aforementioned metafields are the critical data without which
+the extraction does not make any sense; if any of them fail to be
+extracted, the extractor is considered completely broken.
+.PP
+Any
+field (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L149-L257)
+apart from the aforementioned ones is considered \f[B]optional\f[].
+That means that extraction should be \f[B]tolerant\f[] to situations
+when sources for these fields can potentially be unavailable (even if
+they are always available at the moment) and \f[B]future\-proof\f[] in
+order not to break the extraction of general purpose mandatory fields.
+.SS Example
+.PP
+Say you have some source dictionary \f[C]meta\f[] that you\[aq]ve
+fetched as JSON with an HTTP request and it has a key \f[C]summary\f[]:
+.IP
+.nf
+\f[C]
+meta\ =\ self._download_json(url,\ video_id)
+\f[]
+.fi
+.PP
+Assume at this point \f[C]meta\f[]\[aq]s layout is:
+.IP
+.nf
+\f[C]
+{
+\ \ \ \ ...
+\ \ \ \ "summary":\ "some\ fancy\ summary\ text",
+\ \ \ \ ...
+}
+\f[]
+.fi
+.PP
+Assume you want to extract \f[C]summary\f[] and put it into the
+resulting info dict as \f[C]description\f[].
+Since \f[C]description\f[] is an optional meta field you should be
+prepared for this key to be missing from the \f[C]meta\f[] dict, so you
+should extract it like this:
+.IP
+.nf
+\f[C]
+description\ =\ meta.get(\[aq]summary\[aq])\ \ #\ correct
+\f[]
+.fi
+.PP
+and not like:
+.IP
+.nf
+\f[C]
+description\ =\ meta[\[aq]summary\[aq]]\ \ #\ incorrect
+\f[]
+.fi
+.PP
+The latter will break the extraction process with \f[C]KeyError\f[] if
+\f[C]summary\f[] disappears from \f[C]meta\f[] at some later time but
+with the former approach extraction will just go ahead with
+\f[C]description\f[] set to \f[C]None\f[] which is perfectly fine
+(remember \f[C]None\f[] is equivalent to the absence of data).
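The difference between the two forms is plain Python dictionary behavior:

```python
# Plain dictionary behavior behind the two extraction styles above.
meta = {'title': 'some fancy title'}  # 'summary' has disappeared

description = meta.get('summary')  # tolerant: yields None

try:
    description = meta['summary']  # fragile: raises KeyError
except KeyError:
    description = None
```
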
+.PP
+Similarly, you should pass \f[C]fatal=False\f[] when extracting optional
+data from a webpage with \f[C]_search_regex\f[],
+\f[C]_html_search_regex\f[] or similar methods, for instance:
+.IP
+.nf
+\f[C]
+description\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq],
+\ \ \ \ webpage,\ \[aq]description\[aq],\ fatal=False)
+\f[]
+.fi
+.PP
+With \f[C]fatal\f[] set to \f[C]False\f[] if \f[C]_search_regex\f[]
+fails to extract \f[C]description\f[] it will emit a warning and
+continue extraction.
+.PP
+You can also pass \f[C]default=<some\ fallback\ value>\f[], for example:
+.IP
+.nf
+\f[C]
+description\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+id="title"[^>]*>([^<]+)<\[aq],
+\ \ \ \ webpage,\ \[aq]description\[aq],\ default=None)
+\f[]
+.fi
+.PP
+On failure this code will silently continue the extraction with
+\f[C]description\f[] set to \f[C]None\f[].
+That is useful for metafields that may or may not be present.
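The fatal/default semantics can be modeled in a few lines. This is a simplified sketch of the behavior described above, not the real `_search_regex` from `youtube_dl/extractor/common.py`:

```python
import re

_NO_DEFAULT = object()  # sentinel: "no default was passed"

def search_regex(pattern, text, name, default=_NO_DEFAULT, fatal=True):
    # Toy model: return the first group on a match; otherwise fall back
    # to the default silently, or fail loudly/quietly per `fatal`.
    mobj = re.search(pattern, text)
    if mobj:
        return mobj.group(1)
    if default is not _NO_DEFAULT:
        return default  # silent fallback, no warning
    if fatal:
        raise ValueError('Unable to extract %s' % name)
    print('WARNING: unable to extract %s' % name)
    return None
```
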
+.SS Provide fallbacks
+.PP
+When extracting metadata try to do so from multiple sources.
+For example if \f[C]title\f[] is present in several places, try
+extracting from at least some of them.
+This makes it more future\-proof in case some of the sources become
+unavailable.
+.SS Example
+.PP
+Say \f[C]meta\f[] from the previous example has a \f[C]title\f[] and you
+are about to extract it.
+Since \f[C]title\f[] is a mandatory meta field you should end up with
+something like:
+.IP
+.nf
+\f[C]
+title\ =\ meta[\[aq]title\[aq]]
+\f[]
+.fi
+.PP
+If \f[C]title\f[] disappears from \f[C]meta\f[] in future due to some
+changes on the hoster\[aq]s side the extraction would fail since
+\f[C]title\f[] is mandatory.
+That\[aq]s expected.
+.PP
+Assume that you have another source you can extract \f[C]title\f[]
+from, for example the \f[C]og:title\f[] HTML meta tag of a
+\f[C]webpage\f[].
+In this case you can provide a fallback scenario:
+.IP
+.nf
+\f[C]
+title\ =\ meta.get(\[aq]title\[aq])\ or\ self._og_search_title(webpage)
+\f[]
+.fi
+.PP
+This code will try to extract from \f[C]meta\f[] first, and if that
+fails it will fall back to extracting \f[C]og:title\f[] from the
+\f[C]webpage\f[].
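The same pattern chains across any number of sources, and `or` also skips empty strings, which is usually what you want for metadata. The helper below just makes the chain explicit (it is not a youtube-dl function):

```python
def first_truthy(*candidates):
    # Mirror of the `a or b or c` fallback chain: return the first
    # candidate that is neither None nor empty.
    for value in candidates:
        if value:
            return value
    return None

meta = {}  # 'title' has disappeared from the source dictionary
og_title = 'some fancy title'
title = meta.get('title') or og_title
```
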
+.SS Make regular expressions flexible
+.PP
+When writing regular expressions, try to make them fuzzy and flexible.
+.SS Example
+.PP
+Say you need to extract \f[C]title\f[] from the following HTML code:
+.IP
+.nf
+\f[C]
+<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">some\ fancy\ title</span>
+\f[]
+.fi
+.PP
+The code for that task should look similar to:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+class="title"[^>]*>([^<]+)\[aq],\ webpage,\ \[aq]title\[aq])
+\f[]
+.fi
+.PP
+Or even better:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span[^>]+class=(["\\\[aq]])title\\1[^>]*>(?P<title>[^<]+)\[aq],
+\ \ \ \ webpage,\ \[aq]title\[aq],\ group=\[aq]title\[aq])
+\f[]
+.fi
+.PP
+Note how you tolerate potential changes in the \f[C]style\f[]
+attribute\[aq]s value or a switch from double quotes to single quotes
+for the \f[C]class\f[] attribute.
+.PP
+The code definitely should not look like:
+.IP
+.nf
+\f[C]
+title\ =\ self._search_regex(
+\ \ \ \ r\[aq]<span\ style="position:\ absolute;\ left:\ 910px;\ width:\ 90px;\ float:\ right;\ z\-index:\ 9999;"\ class="title">(.*?)</span>\[aq],
+\ \ \ \ webpage,\ \[aq]title\[aq])
+\f[]
+.fi
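You can check the flexible pattern against both markup variants with the standard re module (the HTML snippets are hypothetical):

```python
import re

# The flexible pattern from above, tried against two hypothetical
# revisions of the page markup: one with the noisy style attribute and
# double quotes, one with single quotes and no style at all.
FLEXIBLE = r'<span[^>]+class=(["\'])title\1[^>]*>(?P<title>[^<]+)'

old_html = ('<span style="position: absolute; left: 910px; width: 90px; '
            'float: right; z-index: 9999;" class="title">'
            'some fancy title</span>')
new_html = "<span class='title'>some fancy title</span>"

titles = [re.search(FLEXIBLE, html).group('title')
          for html in (old_html, new_html)]
```

The rigid pattern quoted above would match only the first variant; the flexible one survives both.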
+.SS Use safe conversion functions
+.PP
+Wrap all extracted numeric data into safe functions from \f[C]utils\f[]:
+\f[C]int_or_none\f[], \f[C]float_or_none\f[].
+Use them for string to number conversions as well.
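A minimal sketch of what such a helper does; the real `int_or_none` and `float_or_none` in `youtube_dl/utils.py` additionally support scaling and attribute access:

```python
def int_or_none(v, default=None):
    # Simplified sketch: convert v to int, returning `default` instead
    # of raising on None, empty, or malformed input.
    if v is None or v == '':
        return default
    try:
        return int(v)
    except (TypeError, ValueError):
        return default
```
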
+.SH EMBEDDING YOUTUBE\-DL
+.PP
+youtube\-dl makes the best effort to be a good command\-line program,
+and thus should be callable from any programming language.
+If you encounter any problems parsing its output, feel free to create a
+report (https://github.com/rg3/youtube-dl/issues/new).
+.PP
+From a Python program, you can embed youtube\-dl in a more powerful
+fashion, like this:
+.IP
+.nf
+\f[C]
+from\ __future__\ import\ unicode_literals
+import\ youtube_dl
+
+ydl_opts\ =\ {}
+with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl:
+\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]])
+\f[]
+.fi
+.PP
+Most likely, you\[aq]ll want to use various options.
+For a list of options available, have a look at
+\f[C]youtube_dl/YoutubeDL.py\f[] (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279).
+For a start, if you want to intercept youtube\-dl\[aq]s output, set a
+\f[C]logger\f[] object.
+.PP
+Here\[aq]s a more complete example of a program that outputs only errors
+(and a short message after the download is finished), and
+downloads/converts the video to an mp3 file:
+.IP
+.nf
+\f[C]
+from\ __future__\ import\ unicode_literals
+import\ youtube_dl
+
+
+class\ MyLogger(object):
+\ \ \ \ def\ debug(self,\ msg):
+\ \ \ \ \ \ \ \ pass
+
+\ \ \ \ def\ warning(self,\ msg):
+\ \ \ \ \ \ \ \ pass
+
+\ \ \ \ def\ error(self,\ msg):
+\ \ \ \ \ \ \ \ print(msg)
+
+
+def\ my_hook(d):
+\ \ \ \ if\ d[\[aq]status\[aq]]\ ==\ \[aq]finished\[aq]:
+\ \ \ \ \ \ \ \ print(\[aq]Done\ downloading,\ now\ converting\ ...\[aq])
+
+
+ydl_opts\ =\ {
+\ \ \ \ \[aq]format\[aq]:\ \[aq]bestaudio/best\[aq],
+\ \ \ \ \[aq]postprocessors\[aq]:\ [{
+\ \ \ \ \ \ \ \ \[aq]key\[aq]:\ \[aq]FFmpegExtractAudio\[aq],
+\ \ \ \ \ \ \ \ \[aq]preferredcodec\[aq]:\ \[aq]mp3\[aq],
+\ \ \ \ \ \ \ \ \[aq]preferredquality\[aq]:\ \[aq]192\[aq],
+\ \ \ \ }],
+\ \ \ \ \[aq]logger\[aq]:\ MyLogger(),
+\ \ \ \ \[aq]progress_hooks\[aq]:\ [my_hook],
+}
+with\ youtube_dl.YoutubeDL(ydl_opts)\ as\ ydl:
+\ \ \ \ ydl.download([\[aq]http://www.youtube.com/watch?v=BaW_jenozKc\[aq]])
+\f[]
+.fi
+.SH BUGS
+.PP
+Bugs and suggestions should be reported at:
+<https://github.com/rg3/youtube-dl/issues>.
+Unless you were prompted to or there is another pertinent reason (e.g.
+GitHub fails to accept the bug report), please do not send bug reports
+via personal email.
+For discussions, join us in the IRC channel
+#youtube\-dl (irc://chat.freenode.net/#youtube-dl) on freenode
+(webchat (http://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
+.PP
+\f[B]Please include the full output of youtube\-dl when run with
+\f[C]\-v\f[]\f[], i.e.
+\f[B]add\f[] \f[C]\-v\f[] flag to \f[B]your command line\f[], copy the
+\f[B]whole\f[] output and post it in the issue body wrapped in ``` for
+better formatting.
+It should look similar to this:
+.IP
+.nf
+\f[C]
+$\ youtube\-dl\ \-v\ <your\ command\ line>
+[debug]\ System\ config:\ []
+[debug]\ User\ config:\ []
+[debug]\ Command\-line\ args:\ [u\[aq]\-v\[aq],\ u\[aq]http://www.youtube.com/watch?v=BaW_jenozKcj\[aq]]
+[debug]\ Encodings:\ locale\ cp1251,\ fs\ mbcs,\ out\ cp866,\ pref\ cp1251
+[debug]\ youtube\-dl\ version\ 2015.12.06
+[debug]\ Git\ HEAD:\ 135392e
+[debug]\ Python\ version\ 2.6.6\ \-\ Windows\-2003Server\-5.2.3790\-SP2
+[debug]\ exe\ versions:\ ffmpeg\ N\-75573\-g1d0487f,\ ffprobe\ N\-75573\-g1d0487f,\ rtmpdump\ 2.4
+[debug]\ Proxy\ map:\ {}
+\&...
+\f[]
+.fi
+.PP
+\f[B]Do not post screenshots of verbose logs; only plain text is
+acceptable.\f[]
+.PP
+The output (including the first lines) contains important debugging
+information.
+Issues without the full output are often not reproducible and therefore
+do not get solved in short order, if ever.
+.PP
+Please re\-read your issue once again to avoid a couple of common
+mistakes (you can and should use this as a checklist):
+.SS Is the description of the issue itself sufficient?
+.PP
+We often get issue reports that we cannot really decipher.
+While in most cases we eventually get the required information after
+asking back multiple times, this poses an unnecessary drain on our
+resources.
+Many contributors, including myself, are also not native speakers, so we
+may misread some parts.
+.PP
+So please elaborate on what feature you are requesting, or what bug you
+want to be fixed.
+Make sure that it\[aq]s obvious
+.IP \[bu] 2
+What the problem is
+.IP \[bu] 2
+How it could be fixed
+.IP \[bu] 2
+What your proposed solution would look like
+.PP
+If your report is shorter than two lines, it is almost certainly missing
+some of these, which makes it hard for us to respond to it.
+We\[aq]re often too polite to close the issue outright, but the missing
+info makes misinterpretation likely.
+As a committer myself, I often get frustrated by these issues, since the
+only possible way for me to move forward on them is to ask for
+clarification over and over.
+.PP
+For bug reports, this means that your report should contain the
+\f[I]complete\f[] output of youtube\-dl when called with the
+\f[C]\-v\f[] flag.
+The error message you get for (most) bugs even says so, but you would
+not believe how many of our bug reports do not contain this information.
+.PP
+If your server has multiple IPs or you suspect censorship, adding
+\f[C]\-\-call\-home\f[] may be a good idea to get more diagnostics.
+If the error is \f[C]ERROR:\ Unable\ to\ extract\ ...\f[] and you cannot
+reproduce it from multiple countries, add \f[C]\-\-dump\-pages\f[]
+(warning: this will yield a rather large output, redirect it to the file
+\f[C]log.txt\f[] by adding \f[C]>log.txt\ 2>&1\f[] to your
+command\-line) or upload the \f[C]\&.dump\f[] files you get when you add
+\f[C]\-\-write\-pages\f[] somewhere (https://gist.github.com/).
+.PP
+\f[B]Site support requests must contain an example URL\f[].
+An example URL is a URL you might want to download, like
+\f[C]http://www.youtube.com/watch?v=BaW_jenozKc\f[].
+There should be an obvious video present.
+Except under very special circumstances, the main page of a video
+service (e.g.
+\f[C]http://www.youtube.com/\f[]) is \f[I]not\f[] an example URL.
+.SS Are you using the latest version?
+.PP
+Before reporting any issue, type \f[C]youtube\-dl\ \-U\f[].
+This should report that you\[aq]re up\-to\-date.
+About 20% of the reports we receive concern bugs that have already been
+fixed, but people are using outdated versions.
+This goes for feature requests as well.
+.SS Is the issue already documented?
+.PP
+Make sure that someone has not already opened the issue you\[aq]re
+trying to open.
+Search at the top of the window or browse the GitHub
+Issues (https://github.com/rg3/youtube-dl/search?type=Issues) of this
+repository.
+If there is an issue, feel free to write something along the lines of
+"This affects me as well, with version 2015.01.01.
+Here is some more information on the issue: ...".
+While some issues may be old, a new post into them often spurs rapid
+activity.
+.SS Why are existing options not enough?
+.PP
+Before requesting a new feature, please have a quick peek at the list of
+supported
+options (https://github.com/rg3/youtube-dl/blob/master/README.md#options).
+Many feature requests are for features that actually exist already!
+Please, absolutely do show off your work in the issue report and detail
+how the existing similar options do \f[I]not\f[] solve your problem.
+.SS Is there enough context in your bug report?
+.PP
+People want to solve problems, and often think they do us a favor by
+breaking down their larger problems (e.g.
+wanting to skip already downloaded files) into a specific request (e.g.
+requesting us to look whether the file exists before downloading the
+info page).
+However, what often happens is that they break down the problem into
+two steps: one simple, and one impossible (or extremely complicated).
+.PP
+We are then presented with a very complicated request when the original
+problem could be solved far easier, e.g.
+by recording the downloaded video IDs in a separate file.
+To avoid this, you must include the greater context where it is
+non\-obvious.
+In particular, every feature request that does not consist of adding
+support for a new site should contain a use case scenario that explains
+in what situation the missing feature would be useful.
+.SS Does the issue involve one problem, and one problem only?
+.PP
+Some of our users seem to think there is a limit to the number of
+issues they can or should open.
+There is no such limit.
+While it may seem appealing to be able to dump all your issues into one
+ticket, that means that someone who solves one of your issues cannot
+mark the issue as closed.
+Typically, reporting a bunch of issues leads to the ticket lingering
+since nobody wants to attack that behemoth, until someone mercifully
+splits the issue into multiple ones.
+.PP
+In particular, every site support request issue should only pertain to
+services at one site (generally under a common domain, but always using
+the same backend technology).
+Do not request support for vimeo user videos, White House podcasts, and
+Google Plus pages in the same issue.
+Also, make sure that you don\[aq]t post bug reports alongside feature
+requests.
+As a rule of thumb, a feature request does not include outputs of
+youtube\-dl that are not immediately related to the feature at hand.
+Do not post reports of a network error alongside the request for a new
+video service.
+.SS Is anyone going to need the feature?
+.PP
+Only post features that you (or an incapacitated friend you can
+personally talk to) require.
+Do not post features because they seem like a good idea.
+If they are really useful, they will be requested by someone who
+requires them.
+.SS Is your question about youtube\-dl?
+.PP
+It may sound strange, but some bug reports we receive are completely
+unrelated to youtube\-dl and relate to a different, or even the
+reporter\[aq]s own, application.
+Please make sure that you are actually using youtube\-dl.
+If you are using a UI for youtube\-dl, report the bug to the maintainer
+of the actual application providing the UI.
+On the other hand, if your UI for youtube\-dl fails in some way you
+believe is related to youtube\-dl, by all means, go ahead and report the
+bug.
+.SH COPYRIGHT
+.PP
+youtube\-dl is released into the public domain by the copyright holders.
+.PP
+This README file was originally written by Daniel
+Bolton (https://github.com/dbbolton) and is likewise released into the
+public domain.
--- /dev/null
+__youtube_dl()
+{
+    local cur prev opts fileopts diropts keywords
+    COMPREPLY=()
+    cur="${COMP_WORDS[COMP_CWORD]}"
+    prev="${COMP_WORDS[COMP_CWORD-1]}"
+    opts="--help --version --update --ignore-errors --abort-on-error --dump-user-agent --list-extractors --extractor-descriptions --force-generic-extractor --default-search --ignore-config --config-location --flat-playlist --mark-watched --no-mark-watched --no-color --proxy --socket-timeout --source-address --force-ipv4 --force-ipv6 --geo-verification-proxy --cn-verification-proxy --playlist-start --playlist-end --playlist-items --match-title --reject-title --max-downloads --min-filesize --max-filesize --date --datebefore --dateafter --min-views --max-views --match-filter --no-playlist --yes-playlist --age-limit --download-archive --include-ads --limit-rate --retries --fragment-retries --skip-unavailable-fragments --abort-on-unavailable-fragment --buffer-size --no-resize-buffer --test --playlist-reverse --playlist-random --xattr-set-filesize --hls-prefer-native --hls-prefer-ffmpeg --hls-use-mpegts --external-downloader --external-downloader-args --batch-file --id --output --autonumber-size --autonumber-start --restrict-filenames --auto-number --title --literal --no-overwrites --continue --no-continue --no-part --no-mtime --write-description --write-info-json --write-annotations --load-info-json --cookies --cache-dir --no-cache-dir --rm-cache-dir --write-thumbnail --write-all-thumbnails --list-thumbnails --quiet --no-warnings --simulate --skip-download --get-url --get-title --get-id --get-thumbnail --get-description --get-duration --get-filename --get-format --dump-json --dump-single-json --print-json --newline --no-progress --console-title --verbose --dump-pages --write-pages --youtube-print-sig-code --print-traffic --call-home --no-call-home --encoding --no-check-certificate --prefer-insecure --user-agent --referer --add-header --bidi-workaround --sleep-interval --max-sleep-interval --format --all-formats --prefer-free-formats --list-formats --youtube-include-dash-manifest --youtube-skip-dash-manifest --merge-output-format --write-sub --write-auto-sub --all-subs --list-subs --sub-format --sub-lang --username --password --twofactor --netrc --video-password --ap-mso --ap-username --ap-password --ap-list-mso --extract-audio --audio-format --audio-quality --recode-video --postprocessor-args --keep-video --no-post-overwrites --embed-subs --embed-thumbnail --add-metadata --metadata-from-title --xattrs --fixup --prefer-avconv --prefer-ffmpeg --ffmpeg-location --exec --convert-subs"
+    keywords=":ytfavorites :ytrecommended :ytsubscriptions :ytwatchlater :ythistory"
+    fileopts="-a|--batch-file|--download-archive|--cookies|--load-info"
+    diropts="--cache-dir"
+
+    if [[ ${prev} =~ ${fileopts} ]]; then
+        COMPREPLY=( $(compgen -f -- ${cur}) )
+        return 0
+    elif [[ ${prev} =~ ${diropts} ]]; then
+        COMPREPLY=( $(compgen -d -- ${cur}) )
+        return 0
+    fi
+
+    if [[ ${cur} =~ : ]]; then
+        COMPREPLY=( $(compgen -W "${keywords}" -- ${cur}) )
+        return 0
+    elif [[ ${cur} == * ]] ; then
+        COMPREPLY=( $(compgen -W "${opts}" -- ${cur}) )
+        return 0
+    fi
+}
+
+complete -F __youtube_dl youtube-dl
--- /dev/null
+
+complete --command youtube-dl --long-option help --short-option h --description 'Print this help text and exit'
+complete --command youtube-dl --long-option version --description 'Print program version and exit'
+complete --command youtube-dl --long-option update --short-option U --description 'Update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)'
+complete --command youtube-dl --long-option ignore-errors --short-option i --description 'Continue on download errors, for example to skip unavailable videos in a playlist'
+complete --command youtube-dl --long-option abort-on-error --description 'Abort downloading of further videos (in the playlist or the command line) if an error occurs'
+complete --command youtube-dl --long-option dump-user-agent --description 'Display the current browser identification'
+complete --command youtube-dl --long-option list-extractors --description 'List all supported extractors'
+complete --command youtube-dl --long-option extractor-descriptions --description 'Output descriptions of all supported extractors'
+complete --command youtube-dl --long-option force-generic-extractor --description 'Force extraction to use the generic extractor'
+complete --command youtube-dl --long-option default-search --description 'Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple". Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.'
+complete --command youtube-dl --long-option ignore-config --description 'Do not read configuration files. When given in the global configuration file /etc/youtube-dl.conf: Do not read the user configuration in ~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt on Windows)'
+complete --command youtube-dl --long-option config-location --description 'Location of the configuration file; either the path to the config or its containing directory.'
+complete --command youtube-dl --long-option flat-playlist --description 'Do not extract the videos of a playlist, only list them.'
+complete --command youtube-dl --long-option mark-watched --description 'Mark videos watched (YouTube only)'
+complete --command youtube-dl --long-option no-mark-watched --description 'Do not mark videos watched (YouTube only)'
+complete --command youtube-dl --long-option no-color --description 'Do not emit color codes in output'
+complete --command youtube-dl --long-option proxy --description 'Use the specified HTTP/HTTPS/SOCKS proxy. To enable experimental SOCKS proxy, specify a proper scheme. For example socks5://127.0.0.1:1080/. Pass in an empty string (--proxy "") for direct connection'
+complete --command youtube-dl --long-option socket-timeout --description 'Time to wait before giving up, in seconds'
+complete --command youtube-dl --long-option source-address --description 'Client-side IP address to bind to'
+complete --command youtube-dl --long-option force-ipv4 --short-option 4 --description 'Make all connections via IPv4'
+complete --command youtube-dl --long-option force-ipv6 --short-option 6 --description 'Make all connections via IPv6'
+complete --command youtube-dl --long-option geo-verification-proxy --description 'Use this proxy to verify the IP address for some geo-restricted sites. The default proxy specified by --proxy (or none, if the options is not present) is used for the actual downloading.'
+complete --command youtube-dl --long-option cn-verification-proxy
+complete --command youtube-dl --long-option playlist-start --description 'Playlist video to start at (default is %default)'
+complete --command youtube-dl --long-option playlist-end --description 'Playlist video to end at (default is last)'
+complete --command youtube-dl --long-option playlist-items --description 'Playlist video items to download. Specify indices of the videos in the playlist separated by commas like: "--playlist-items 1,2,5,8" if you want to download videos indexed 1, 2, 5, 8 in the playlist. You can specify range: "--playlist-items 1-3,7,10-13", it will download the videos at index 1, 2, 3, 7, 10, 11, 12 and 13.'
+complete --command youtube-dl --long-option match-title --description 'Download only matching titles (regex or caseless sub-string)'
+complete --command youtube-dl --long-option reject-title --description 'Skip download for matching titles (regex or caseless sub-string)'
+complete --command youtube-dl --long-option max-downloads --description 'Abort after downloading NUMBER files'
+complete --command youtube-dl --long-option min-filesize --description 'Do not download any videos smaller than SIZE (e.g. 50k or 44.6m)'
+complete --command youtube-dl --long-option max-filesize --description 'Do not download any videos larger than SIZE (e.g. 50k or 44.6m)'
+complete --command youtube-dl --long-option date --description 'Download only videos uploaded in this date'
+complete --command youtube-dl --long-option datebefore --description 'Download only videos uploaded on or before this date (i.e. inclusive)'
+complete --command youtube-dl --long-option dateafter --description 'Download only videos uploaded on or after this date (i.e. inclusive)'
+complete --command youtube-dl --long-option min-views --description 'Do not download any videos with less than COUNT views'
+complete --command youtube-dl --long-option max-views --description 'Do not download any videos with more than COUNT views'
+complete --command youtube-dl --long-option match-filter --description 'Generic video filter. Specify any key (see help for -o for a list of available keys) to match if the key is present, !key to check if the key is not present,key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to compare against a number, and & to require multiple matches. Values which are not known are excluded unless you put a question mark (?) after the operator.For example, to only match videos that have been liked more than 100 times and disliked less than 50 times (or the dislike functionality is not available at the given service), but who also have a description, use --match-filter "like_count > 100 & dislike_count <? 50 & description" .'
+complete --command youtube-dl --long-option no-playlist --description 'Download only the video, if the URL refers to a video and a playlist.'
+complete --command youtube-dl --long-option yes-playlist --description 'Download the playlist, if the URL refers to a video and a playlist.'
+complete --command youtube-dl --long-option age-limit --description 'Download only videos suitable for the given age'
+complete --command youtube-dl --long-option download-archive --description 'Download only videos not listed in the archive file. Record the IDs of all downloaded videos in it.' --require-parameter
+complete --command youtube-dl --long-option include-ads --description 'Download advertisements as well (experimental)'
+complete --command youtube-dl --long-option limit-rate --short-option r --description 'Maximum download rate in bytes per second (e.g. 50K or 4.2M)'
+complete --command youtube-dl --long-option retries --short-option R --description 'Number of retries (default is %default), or "infinite".'
+complete --command youtube-dl --long-option fragment-retries --description 'Number of retries for a fragment (default is %default), or "infinite" (DASH and hlsnative only)'
+complete --command youtube-dl --long-option skip-unavailable-fragments --description 'Skip unavailable fragments (DASH and hlsnative only)'
+complete --command youtube-dl --long-option abort-on-unavailable-fragment --description 'Abort downloading when some fragment is not available'
+complete --command youtube-dl --long-option buffer-size --description 'Size of download buffer (e.g. 1024 or 16K) (default is %default)'
+complete --command youtube-dl --long-option no-resize-buffer --description 'Do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.'
+complete --command youtube-dl --long-option test
+complete --command youtube-dl --long-option playlist-reverse --description 'Download playlist videos in reverse order'
+complete --command youtube-dl --long-option playlist-random --description 'Download playlist videos in random order'
+complete --command youtube-dl --long-option xattr-set-filesize --description 'Set file xattribute ytdl.filesize with expected file size (experimental)'
+complete --command youtube-dl --long-option hls-prefer-native --description 'Use the native HLS downloader instead of ffmpeg'
+complete --command youtube-dl --long-option hls-prefer-ffmpeg --description 'Use ffmpeg instead of the native HLS downloader'
+complete --command youtube-dl --long-option hls-use-mpegts --description 'Use the mpegts container for HLS videos, allowing to play the video while downloading (some players may not be able to play it)'
+complete --command youtube-dl --long-option external-downloader --description 'Use the specified external downloader. Currently supports aria2c,avconv,axel,curl,ffmpeg,httpie,wget'
+complete --command youtube-dl --long-option external-downloader-args --description 'Give these arguments to the external downloader'
+complete --command youtube-dl --long-option batch-file --short-option a --description 'File containing URLs to download ('"'"'-'"'"' for stdin)' --require-parameter
+complete --command youtube-dl --long-option id --description 'Use only video ID in file name'
+complete --command youtube-dl --long-option output --short-option o --description 'Output filename template, see the "OUTPUT TEMPLATE" for all the info'
+complete --command youtube-dl --long-option autonumber-size --description 'Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given (default is %default)'
+complete --command youtube-dl --long-option autonumber-start --description 'Specify the start value for %(autonumber)s (default is %default)'
+complete --command youtube-dl --long-option restrict-filenames --description 'Restrict filenames to only ASCII characters, and avoid "&" and spaces in filenames'
+complete --command youtube-dl --long-option auto-number --short-option A --description '[deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000'
+complete --command youtube-dl --long-option title --short-option t --description '[deprecated] Use title in file name (default)'
+complete --command youtube-dl --long-option literal --short-option l --description '[deprecated] Alias of --title'
+complete --command youtube-dl --long-option no-overwrites --short-option w --description 'Do not overwrite files'
+complete --command youtube-dl --long-option continue --short-option c --description 'Force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.'
+complete --command youtube-dl --long-option no-continue --description 'Do not resume partially downloaded files (restart from beginning)'
+complete --command youtube-dl --long-option no-part --description 'Do not use .part files - write directly into output file'
+complete --command youtube-dl --long-option no-mtime --description 'Do not use the Last-modified header to set the file modification time'
+complete --command youtube-dl --long-option write-description --description 'Write video description to a .description file'
+complete --command youtube-dl --long-option write-info-json --description 'Write video metadata to a .info.json file'
+complete --command youtube-dl --long-option write-annotations --description 'Write video annotations to a .annotations.xml file'
+complete --command youtube-dl --long-option load-info-json --description 'JSON file containing the video information (created with the "--write-info-json" option)'
+complete --command youtube-dl --long-option cookies --description 'File to read cookies from and dump cookie jar in' --require-parameter
+complete --command youtube-dl --long-option cache-dir --description 'Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may change.'
+complete --command youtube-dl --long-option no-cache-dir --description 'Disable filesystem caching'
+complete --command youtube-dl --long-option rm-cache-dir --description 'Delete all filesystem cache files'
+complete --command youtube-dl --long-option write-thumbnail --description 'Write thumbnail image to disk'
+complete --command youtube-dl --long-option write-all-thumbnails --description 'Write all thumbnail image formats to disk'
+complete --command youtube-dl --long-option list-thumbnails --description 'Simulate and list all available thumbnail formats'
+complete --command youtube-dl --long-option quiet --short-option q --description 'Activate quiet mode'
+complete --command youtube-dl --long-option no-warnings --description 'Ignore warnings'
+complete --command youtube-dl --long-option simulate --short-option s --description 'Do not download the video and do not write anything to disk'
+complete --command youtube-dl --long-option skip-download --description 'Do not download the video'
+complete --command youtube-dl --long-option get-url --short-option g --description 'Simulate, quiet but print URL'
+complete --command youtube-dl --long-option get-title --short-option e --description 'Simulate, quiet but print title'
+complete --command youtube-dl --long-option get-id --description 'Simulate, quiet but print id'
+complete --command youtube-dl --long-option get-thumbnail --description 'Simulate, quiet but print thumbnail URL'
+complete --command youtube-dl --long-option get-description --description 'Simulate, quiet but print video description'
+complete --command youtube-dl --long-option get-duration --description 'Simulate, quiet but print video length'
+complete --command youtube-dl --long-option get-filename --description 'Simulate, quiet but print output filename'
+complete --command youtube-dl --long-option get-format --description 'Simulate, quiet but print output format'
+complete --command youtube-dl --long-option dump-json --short-option j --description 'Simulate, quiet but print JSON information. See --output for a description of available keys.'
+complete --command youtube-dl --long-option dump-single-json --short-option J --description 'Simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist information in a single line.'
+complete --command youtube-dl --long-option print-json --description 'Be quiet and print the video information as JSON (video is still being downloaded).'
+complete --command youtube-dl --long-option newline --description 'Output progress bar as new lines'
+complete --command youtube-dl --long-option no-progress --description 'Do not print progress bar'
+complete --command youtube-dl --long-option console-title --description 'Display progress in console titlebar'
+complete --command youtube-dl --long-option verbose --short-option v --description 'Print various debugging information'
+complete --command youtube-dl --long-option dump-pages --description 'Print downloaded pages encoded using base64 to debug problems (very verbose)'
+complete --command youtube-dl --long-option write-pages --description 'Write downloaded intermediary pages to files in the current directory to debug problems'
+complete --command youtube-dl --long-option youtube-print-sig-code
+complete --command youtube-dl --long-option print-traffic --description 'Display sent and read HTTP traffic'
+complete --command youtube-dl --long-option call-home --short-option C --description 'Contact the youtube-dl server for debugging'
+complete --command youtube-dl --long-option no-call-home --description 'Do NOT contact the youtube-dl server for debugging'
+complete --command youtube-dl --long-option encoding --description 'Force the specified encoding (experimental)'
+complete --command youtube-dl --long-option no-check-certificate --description 'Suppress HTTPS certificate validation'
+complete --command youtube-dl --long-option prefer-insecure --description 'Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)'
+complete --command youtube-dl --long-option user-agent --description 'Specify a custom user agent'
+complete --command youtube-dl --long-option referer --description 'Specify a custom referer, use if the video access is restricted to one domain'
+complete --command youtube-dl --long-option add-header --description 'Specify a custom HTTP header and its value, separated by a colon '"'"':'"'"'. You can use this option multiple times'
+complete --command youtube-dl --long-option bidi-workaround --description 'Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH'
+complete --command youtube-dl --long-option sleep-interval --description 'Number of seconds to sleep before each download when used alone or a lower bound of a range for randomized sleep before each download (minimum possible number of seconds to sleep) when used along with --max-sleep-interval.'
+complete --command youtube-dl --long-option max-sleep-interval --description 'Upper bound of a range for randomized sleep before each download (maximum possible number of seconds to sleep). Must only be used along with --min-sleep-interval.'
+complete --command youtube-dl --long-option format --short-option f --description 'Video format code, see the "FORMAT SELECTION" for all the info'
+complete --command youtube-dl --long-option all-formats --description 'Download all available video formats'
+complete --command youtube-dl --long-option prefer-free-formats --description 'Prefer free video formats unless a specific one is requested'
+complete --command youtube-dl --long-option list-formats --short-option F --description 'List all available formats of requested videos'
+complete --command youtube-dl --long-option youtube-include-dash-manifest
+complete --command youtube-dl --long-option youtube-skip-dash-manifest --description 'Do not download the DASH manifests and related data on YouTube videos'
+complete --command youtube-dl --long-option merge-output-format --description 'If a merge is required (e.g. bestvideo+bestaudio), output to given container format. One of mkv, mp4, ogg, webm, flv. Ignored if no merge is required'
+complete --command youtube-dl --long-option write-sub --description 'Write subtitle file'
+complete --command youtube-dl --long-option write-auto-sub --description 'Write automatically generated subtitle file (YouTube only)'
+complete --command youtube-dl --long-option all-subs --description 'Download all the available subtitles of the video'
+complete --command youtube-dl --long-option list-subs --description 'List all available subtitles for the video'
+complete --command youtube-dl --long-option sub-format --description 'Subtitle format, accepts formats preference, for example: "srt" or "ass/srt/best"'
+complete --command youtube-dl --long-option sub-lang --description 'Languages of the subtitles to download (optional) separated by commas, use --list-subs for available language tags'
+complete --command youtube-dl --long-option username --short-option u --description 'Login with this account ID'
+complete --command youtube-dl --long-option password --short-option p --description 'Account password. If this option is left out, youtube-dl will ask interactively.'
+complete --command youtube-dl --long-option twofactor --short-option 2 --description 'Two-factor authentication code'
+complete --command youtube-dl --long-option netrc --short-option n --description 'Use .netrc authentication data'
+complete --command youtube-dl --long-option video-password --description 'Video password (vimeo, smotri, youku)'
+complete --command youtube-dl --long-option ap-mso --description 'Adobe Pass multiple-system operator (TV provider) identifier, use --ap-list-mso for a list of available MSOs'
+complete --command youtube-dl --long-option ap-username --description 'Multiple-system operator account login'
+complete --command youtube-dl --long-option ap-password --description 'Multiple-system operator account password. If this option is left out, youtube-dl will ask interactively.'
+complete --command youtube-dl --long-option ap-list-mso --description 'List all supported multiple-system operators'
+complete --command youtube-dl --long-option extract-audio --short-option x --description 'Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)'
+complete --command youtube-dl --long-option audio-format --description 'Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default; No effect without -x'
+complete --command youtube-dl --long-option audio-quality --description 'Specify ffmpeg/avconv audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default %default)'
+complete --command youtube-dl --long-option recode-video --description 'Encode the video to another format if necessary (currently supported: mp4|flv|ogg|webm|mkv|avi)' --arguments 'mp4 flv ogg webm mkv' --exclusive
+complete --command youtube-dl --long-option postprocessor-args --description 'Give these arguments to the postprocessor'
+complete --command youtube-dl --long-option keep-video --short-option k --description 'Keep the video file on disk after the post-processing; the video is erased by default'
+complete --command youtube-dl --long-option no-post-overwrites --description 'Do not overwrite post-processed files; the post-processed files are overwritten by default'
+complete --command youtube-dl --long-option embed-subs --description 'Embed subtitles in the video (only for mp4, webm and mkv videos)'
+complete --command youtube-dl --long-option embed-thumbnail --description 'Embed thumbnail in the audio as cover art'
+complete --command youtube-dl --long-option add-metadata --description 'Write metadata to the video file'
+complete --command youtube-dl --long-option metadata-from-title --description 'Parse additional metadata like song title / artist from the video title. The format syntax is the same as --output, the parsed parameters replace existing values. Additional templates: %(album)s, %(artist)s. Example: --metadata-from-title "%(artist)s - %(title)s" matches a title like "Coldplay - Paradise"'
+complete --command youtube-dl --long-option xattrs --description 'Write metadata to the video file'"'"'s xattrs (using dublin core and xdg standards)'
+complete --command youtube-dl --long-option fixup --description 'Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn (the default; fix file if we can, warn otherwise)'
+complete --command youtube-dl --long-option prefer-avconv --description 'Prefer avconv over ffmpeg for running the postprocessors (default)'
+complete --command youtube-dl --long-option prefer-ffmpeg --description 'Prefer ffmpeg over avconv for running the postprocessors'
+complete --command youtube-dl --long-option ffmpeg-location --description 'Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory.'
+complete --command youtube-dl --long-option exec --description 'Execute a command on the file after downloading, similar to find'"'"'s -exec syntax. Example: --exec '"'"'adb push {} /sdcard/Music/ && rm {}'"'"''
+complete --command youtube-dl --long-option convert-subs --description 'Convert the subtitles to other format (currently supported: srt|ass|vtt)'
+
+
+complete --command youtube-dl --arguments ":ytfavorites :ytrecommended :ytsubscriptions :ytwatchlater :ythistory"
+++ /dev/null
-# This allows the youtube-dl command to be installed in ZSH using antigen.
-# Antigen is a bundle manager. It allows you to enhance the functionality of
-# your zsh session by installing bundles and themes easily.
-
-# Antigen documentation:
-# http://antigen.sharats.me/
-# https://github.com/zsh-users/antigen
-
-# Install youtube-dl:
-# antigen bundle rg3/youtube-dl
-# Bundles installed by antigen are available for use immediately.
-
-# Update youtube-dl (and all other antigen bundles):
-# antigen update
-
-# The antigen command will download the git repository to a folder and then
-# execute an enabling script (this file). The complete process for loading the
-# code is documented here:
-# https://github.com/zsh-users/antigen#notes-on-writing-plugins
-
-# This specific script just aliases youtube-dl to the python script that this
-# library provides. This requires updating the PYTHONPATH to ensure that the
-# full set of code can be located.
-alias youtube-dl="PYTHONPATH=$(dirname $0) $(dirname $0)/bin/youtube-dl"
--- /dev/null
+#compdef youtube-dl
+
+__youtube_dl() {
+    local curcontext="$curcontext" fileopts diropts cur prev
+    typeset -A opt_args
+    fileopts="--download-archive|-a|--batch-file|--load-info-json|--load-info|--cookies"
+    diropts="--cache-dir"
+    cur=$words[CURRENT]
+    case $cur in
+        :)
+            _arguments '*: :(::ytfavorites ::ytrecommended ::ytsubscriptions ::ytwatchlater ::ythistory)'
+        ;;
+        *)
+            prev=$words[CURRENT-1]
+            if [[ ${prev} =~ ${fileopts} ]]; then
+                _path_files
+            elif [[ ${prev} =~ ${diropts} ]]; then
+                _path_files -/
+            elif [[ ${prev} == "--recode-video" ]]; then
+                _arguments '*: :(mp4 flv ogg webm mkv)'
+            else
+ _arguments '*: :(--help --version --update --ignore-errors --abort-on-error --dump-user-agent --list-extractors --extractor-descriptions --force-generic-extractor --default-search --ignore-config --config-location --flat-playlist --mark-watched --no-mark-watched --no-color --proxy --socket-timeout --source-address --force-ipv4 --force-ipv6 --geo-verification-proxy --cn-verification-proxy --playlist-start --playlist-end --playlist-items --match-title --reject-title --max-downloads --min-filesize --max-filesize --date --datebefore --dateafter --min-views --max-views --match-filter --no-playlist --yes-playlist --age-limit --download-archive --include-ads --limit-rate --retries --fragment-retries --skip-unavailable-fragments --abort-on-unavailable-fragment --buffer-size --no-resize-buffer --test --playlist-reverse --playlist-random --xattr-set-filesize --hls-prefer-native --hls-prefer-ffmpeg --hls-use-mpegts --external-downloader --external-downloader-args --batch-file --id --output --autonumber-size --autonumber-start --restrict-filenames --auto-number --title --literal --no-overwrites --continue --no-continue --no-part --no-mtime --write-description --write-info-json --write-annotations --load-info-json --cookies --cache-dir --no-cache-dir --rm-cache-dir --write-thumbnail --write-all-thumbnails --list-thumbnails --quiet --no-warnings --simulate --skip-download --get-url --get-title --get-id --get-thumbnail --get-description --get-duration --get-filename --get-format --dump-json --dump-single-json --print-json --newline --no-progress --console-title --verbose --dump-pages --write-pages --youtube-print-sig-code --print-traffic --call-home --no-call-home --encoding --no-check-certificate --prefer-insecure --user-agent --referer --add-header --bidi-workaround --sleep-interval --max-sleep-interval --format --all-formats --prefer-free-formats --list-formats --youtube-include-dash-manifest --youtube-skip-dash-manifest --merge-output-format --write-sub --write-auto-sub --all-subs --list-subs --sub-format --sub-lang --username --password --twofactor --netrc --video-password --ap-mso --ap-username --ap-password --ap-list-mso --extract-audio --audio-format --audio-quality --recode-video --postprocessor-args --keep-video --no-post-overwrites --embed-subs --embed-thumbnail --add-metadata --metadata-from-title --xattrs --fixup --prefer-avconv --prefer-ffmpeg --ffmpeg-location --exec --convert-subs)'
+ fi
+ ;;
+ esac
+}
+
+__youtube_dl
\ No newline at end of file
import time
import tokenize
import traceback
+import random
from .compat import (
compat_basestring,
playlistend: Playlist item to end at.
playlist_items: Specific indices of playlist to download.
playlistreverse: Download playlist items in reverse order.
+ playlistrandom: Download playlist items in random order.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logger: Log messages to a logging.Logger instance.
if autonumber_size is None:
autonumber_size = 5
autonumber_templ = '%0' + str(autonumber_size) + 'd'
- template_dict['autonumber'] = autonumber_templ % self._num_downloads
+ template_dict['autonumber'] = autonumber_templ % (self.params.get('autonumber_start', 1) - 1 + self._num_downloads)
if template_dict.get('playlist_index') is not None:
template_dict['playlist_index'] = '%0*d' % (len(str(template_dict['n_entries'])), template_dict['playlist_index'])
if template_dict.get('resolution') is None:
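The `%(autonumber)s` change above folds the new `--autonumber-start` value into the existing padded template. A minimal stand-alone sketch of that arithmetic (function name and defaults are illustrative, mirroring the hunk above):

```python
def render_autonumber(num_downloads, autonumber_size=None, autonumber_start=1):
    # Pad to the requested width, counting from the configured start
    # value; the first download gets autonumber_start itself.
    if autonumber_size is None:
        autonumber_size = 5
    autonumber_templ = '%0' + str(autonumber_size) + 'd'
    return autonumber_templ % (autonumber_start - 1 + num_downloads)
```

With the defaults this yields `'00001'` for the first download; `--autonumber-start 100` would make it `'00100'`.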
if self.params.get('playlistreverse', False):
entries = entries[::-1]
+ if self.params.get('playlistrandom', False):
+ random.shuffle(entries)
+
for i, entry in enumerate(entries, 1):
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
extra = {
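The new `playlistrandom` option composes with the existing `playlistreverse` handling; a small sketch of the ordering logic in isolation (helper name assumed):

```python
import random

def order_playlist(entries, playlistreverse=False, playlistrandom=False):
    # Reverse first, then shuffle if random order was requested --
    # the same sequence the downloader applies above.
    entries = list(entries)
    if playlistreverse:
        entries = entries[::-1]
    if playlistrandom:
        random.shuffle(entries)
    return entries
```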
format['format_id'] = compat_str(i)
else:
# Sanitize format_id from characters used in format selector expression
- format['format_id'] = re.sub('[\s,/+\[\]()]', '_', format['format_id'])
+ format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
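The raw-string fix above also makes clear what the sanitizer does: characters that are meaningful in format selector expressions become underscores so a generated `format_id` remains selectable. A quick illustration (the input values are hypothetical):

```python
import re

def sanitize_format_id(format_id):
    # Whitespace and the selector metacharacters , / + [ ] ( )
    # are all replaced by underscores.
    return re.sub(r'[\s,/+\[\]()]', '_', format_id)
```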
format['ext'] = determine_ext(format['url']).lower()
# Automatically determine protocol if missing (useful for format
# selection purposes)
- if 'protocol' not in format:
+ if format.get('protocol') is None:
format['protocol'] = determine_protocol(format)
# Add HTTP headers, so that external programs can use them from the
# json output
parser.error('TV Provider account username missing\n')
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
parser.error('using output template conflicts with using title, video ID or auto number')
+ if opts.autonumber_size is not None:
+ if opts.autonumber_size <= 0:
+ parser.error('auto number size must be positive')
+ if opts.autonumber_start is not None:
+ if opts.autonumber_start < 0:
+ parser.error('auto number start must be positive or 0')
if opts.usetitle and opts.useid:
parser.error('using title conflicts with using video ID')
if opts.username is not None and opts.password is None:
'listformats': opts.listformats,
'outtmpl': outtmpl,
'autonumber_size': opts.autonumber_size,
+ 'autonumber_start': opts.autonumber_start,
'restrictfilenames': opts.restrictfilenames,
'ignoreerrors': opts.ignoreerrors,
'force_generic_extractor': opts.force_generic_extractor,
'playliststart': opts.playliststart,
'playlistend': opts.playlistend,
'playlistreverse': opts.playlist_reverse,
+ 'playlistrandom': opts.playlist_random,
'noplaylist': opts.noplaylist,
'logtostderr': opts.outtmpl == '-',
'consoletitle': opts.consoletitle,
'postprocessor_args': postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
-
+ 'config_location': opts.config_location,
}
with YoutubeDL(ydl_opts) as ydl:
from urllib.parse import unquote_plus as compat_urllib_parse_unquote_plus
except ImportError: # Python 2
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire')
- else re.compile('([\x00-\x7f]+)'))
+ else re.compile(r'([\x00-\x7f]+)'))
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
# implementations from cpython 3.4.3's stdlib. Python 2's version
el.text = el.text.decode('utf-8')
return doc
+if hasattr(etree, 'register_namespace'):
+ compat_etree_register_namespace = etree.register_namespace
+else:
+ def compat_etree_register_namespace(prefix, uri):
+ """Register a namespace prefix.
+ The registry is global, and any existing mapping for either the
+ given prefix or the namespace URI will be removed.
+ *prefix* is the namespace prefix, *uri* is a namespace uri. Tags and
+ attributes in this namespace will be serialized with prefix if possible.
+ ValueError is raised if prefix is reserved or is invalid.
+ """
+ if re.match(r"ns\d+$", prefix):
+ raise ValueError("Prefix format reserved for internal use")
+ for k, v in list(etree._namespace_map.items()):
+ if k == uri or v == prefix:
+ del etree._namespace_map[k]
+ etree._namespace_map[uri] = prefix
+
if sys.version_info < (2, 7):
# Here comes the crazy part: In 2.6, if the xpath is a unicode,
# .//node does not match if a node is a direct child of . !
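The `compat_etree_register_namespace` addition follows the compat module's usual pattern: re-export the stdlib implementation when it exists, otherwise fall back to a local copy. The pattern in isolation (the MRSS URI below is just an example):

```python
import xml.etree.ElementTree as etree

if hasattr(etree, 'register_namespace'):
    register_namespace = etree.register_namespace
else:
    # Very old Pythons: poke the private map directly, as above.
    def register_namespace(prefix, uri):
        etree._namespace_map[uri] = prefix

# Registered prefixes are used when serializing qualified tags.
register_namespace('media', 'http://search.yahoo.com/mrss/')
```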
'compat_cookiejar',
'compat_cookies',
'compat_etree_fromstring',
+ 'compat_etree_register_namespace',
'compat_expanduser',
'compat_get_terminal_size',
'compat_getenv',
encodeArgument,
handle_youtubedl_headers,
check_executable,
+ is_outdated_version,
)
args = [ffpp.executable, '-y']
+ seekable = info_dict.get('_seekable')
+ if seekable is not None:
+ # setting -seekable prevents ffmpeg from guessing whether the server
+ # supports seeking (by adding the header `Range: bytes=0-`), which
+ # can cause problems in some cases
+ # https://github.com/rg3/youtube-dl/issues/11800#issuecomment-275037127
+ # http://trac.ffmpeg.org/ticket/6125#comment:10
+ args += ['-seekable', '1' if seekable else '0']
+
args += self._configuration_args()
# start_time = info_dict.get('start_time') or 0
if self.params.get('hls_use_mpegts', False) or tmpfilename == '-':
args += ['-f', 'mpegts']
else:
- args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
+ args += ['-f', 'mp4']
+ if (ffpp.basename == 'ffmpeg' and is_outdated_version(ffpp._versions['ffmpeg'], '3.2')) and (not info_dict.get('acodec') or info_dict['acodec'].split('.')[0] in ('aac', 'mp4a')):
+ args += ['-bsf:a', 'aac_adtstoasc']
elif protocol == 'rtmp':
args += ['-f', 'flv']
else:
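The conditional above only applies the `aac_adtstoasc` bitstream filter when it is actually needed: ffmpeg 3.2+ inserts it automatically when remuxing ADTS AAC into mp4, so it is passed only for older builds, and only when the audio codec is (or may be) AAC. A sketch of that decision (helper and parameter names are illustrative):

```python
def mp4_remux_args(ffmpeg_older_than_3_2, acodec):
    # Build the mp4 muxing arguments; the bitstream filter is added
    # only for old ffmpeg builds and AAC (or unknown) audio.
    args = ['-f', 'mp4']
    if ffmpeg_older_than_3_2 and (
            not acodec or acodec.split('.')[0] in ('aac', 'mp4a')):
        args += ['-bsf:a', 'aac_adtstoasc']
    return args
```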
'noprogress': True,
'ratelimit': self.params.get('ratelimit'),
'retries': self.params.get('retries', 0),
+ 'nopart': self.params.get('nopart', False),
'test': self.params.get('test', False),
}
)
s = manifest.decode('utf-8', 'ignore')
if not self.can_download(s, info_dict):
+ if info_dict.get('extra_param_to_segment_url'):
+ self.report_error('pycrypto not found. Please install it.')
+ return False
self.report_warning(
'hlsnative has detected features it does not support, '
'extraction will be delegated to ffmpeg')
'title': '\'This Week\' Exclusive: Iran\'s Foreign Minister Zarif',
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
},
'ext': 'mp4',
'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421123075,
'upload_date': '20150113',
'uploader': 'Jonathan Bloom',
from ..compat import compat_str
from ..utils import (
int_or_none,
+ parse_iso8601,
OnDemandPagedList,
)
class ACastIE(InfoExtractor):
IE_NAME = 'acast'
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
- _TEST = {
+ _TESTS = [{
+ # test with one bling
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
'md5': 'ada3de5a1e3a2a381327d749854788bb',
'info_dict': {
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan',
- 'timestamp': 1196172000000,
+ 'timestamp': 1196172000,
+ 'upload_date': '20071127',
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211,
}
- }
+ }, {
+ # test with multiple blings
+ 'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
+ 'md5': '55c0097badd7095f494c99a172f86501',
+ 'info_dict': {
+ 'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
+ 'ext': 'mp3',
+ 'title': '2. Raggarmordet - Röster ur det förflutna',
+ 'timestamp': 1477346700,
+ 'upload_date': '20161024',
+ 'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
+ 'duration': 2797,
+ }
+ }]
def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups()
return {
'id': compat_str(cast_data['id']),
'display_id': display_id,
- 'url': cast_data['blings'][0]['audio'],
+ 'url': [b['audio'] for b in cast_data['blings'] if b['type'] == 'BlingAudio'][0],
'title': cast_data['name'],
'description': cast_data.get('description'),
'thumbnail': cast_data.get('image'),
- 'timestamp': int_or_none(cast_data.get('publishingDate')),
+ 'timestamp': parse_iso8601(cast_data.get('publishingDate')),
'duration': int_or_none(cast_data.get('duration')),
}
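The timestamp fix swaps `int_or_none` (which treated the value as epoch milliseconds) for `parse_iso8601`, since the API field is an ISO 8601 string. A minimal stdlib stand-in for the one date shape involved, assuming values like `'2016-10-24T20:45:00.000Z'` (the real `parse_iso8601` accepts many more variants):

```python
import calendar
from datetime import datetime

def parse_iso8601_basic(date_str):
    # Handles only the 'YYYY-MM-DDTHH:MM:SS(.fff)Z' UTC shape and
    # returns a Unix timestamp in seconds.
    if date_str is None:
        return None
    date_str = date_str.split('.')[0].rstrip('Z')
    dt = datetime.strptime(date_str, '%Y-%m-%dT%H:%M:%S')
    return calendar.timegm(dt.timetuple())
```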
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20110914',
'duration': 60,
'view_count': int,
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
- 'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
+ 'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
self._html_search_meta('aetn:SeriesTitle', webpage))
elif url_parts_len == 2:
entries = []
- for episode_item in re.findall(r'(?s)<div[^>]+class="[^"]*episode-item[^"]*"[^>]*>', webpage):
+ for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage):
episode_attributes = extract_attributes(episode_item)
episode_url = compat_urlparse.urljoin(
url, episode_attributes['data-canonical'])
query = {
'mbr': 'true',
- 'assetTypes': 'medium_video_s3'
+ 'assetTypes': 'high_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
'id': 'world-war-i-history',
'title': 'World War I History',
},
- 'playlist_mincount': 24,
+ 'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
- 'switch': 'hls'
+ 'switch': 'hls',
+ 'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
- 'switch': 'hls'
+ 'switch': 'hls',
+ 'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))
class AfreecaTVIE(InfoExtractor):
+ IE_NAME = 'afreecatv'
IE_DESC = 'afreecatv.com'
_VALID_URL = r'''(?x)
https?://
expected=True)
return info
+
+
+class AfreecaTVGlobalIE(AfreecaTVIE):
+ IE_NAME = 'afreecatv:global'
+ _VALID_URL = r'https?://(?:www\.)?afreeca\.tv/(?P<channel_id>\d+)(?:/v/(?P<video_id>\d+))?'
+ _TESTS = [{
+ 'url': 'http://afreeca.tv/36853014/v/58301',
+ 'info_dict': {
+ 'id': '58301',
+ 'title': 'tryhard top100',
+ 'uploader_id': '36853014',
+ 'uploader': 'makgi Hearthstone Live!',
+ },
+ 'playlist_count': 3,
+ }]
+
+ def _real_extract(self, url):
+ channel_id, video_id = re.match(self._VALID_URL, url).groups()
+ video_type = 'video' if video_id else 'live'
+ query = {
+ 'pt': 'view',
+ 'bid': channel_id,
+ }
+ if video_id:
+ query['vno'] = video_id
+ video_data = self._download_json(
+ 'http://api.afreeca.tv/%s/view_%s.php' % (video_type, video_type),
+ video_id or channel_id, query=query)['channel']
+
+ if video_data.get('result') != 1:
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, video_data['remsg']))
+
+ title = video_data['title']
+
+ info = {
+ 'thumbnail': video_data.get('thumb'),
+ 'view_count': int_or_none(video_data.get('vcnt')),
+ 'age_limit': int_or_none(video_data.get('grade')),
+ 'uploader_id': channel_id,
+ 'uploader': video_data.get('cname'),
+ }
+
+ if video_id:
+ entries = []
+ for i, f in enumerate(video_data.get('flist', [])):
+ video_key = self.parse_video_key(f.get('key', ''))
+ f_url = f.get('file')
+ if not video_key or not f_url:
+ continue
+ entries.append({
+ 'id': '%s_%s' % (video_id, video_key.get('part', i + 1)),
+ 'title': title,
+ 'upload_date': video_key.get('upload_date'),
+ 'duration': int_or_none(f.get('length')),
+ 'url': f_url,
+ 'protocol': 'm3u8_native',
+ 'ext': 'mp4',
+ })
+
+ info.update({
+ 'id': video_id,
+ 'title': title,
+ 'duration': int_or_none(video_data.get('length')),
+ })
+ if len(entries) > 1:
+ info['_type'] = 'multi_video'
+ info['entries'] = entries
+ elif len(entries) == 1:
+ i = entries[0].copy()
+ i.update(info)
+ info = i
+ else:
+ formats = []
+ for s in video_data.get('strm', []):
+ s_url = s.get('purl')
+ if not s_url:
+ continue
+ stype = s.get('stype')
+ if stype == 'HLS':
+ formats.extend(self._extract_m3u8_formats(
+ s_url, channel_id, 'mp4', m3u8_id=stype, fatal=False))
+ elif stype == 'RTMP':
+ format_id = [stype]
+ label = s.get('label')
+ if label:
+ format_id.append(label)
+ formats.append({
+ 'format_id': '-'.join(format_id),
+ 'url': s_url,
+ 'tbr': int_or_none(s.get('bps')),
+ 'height': int_or_none(s.get('brt')),
+ 'ext': 'flv',
+ 'rtmp_live': True,
+ })
+ self._sort_formats(formats)
+
+ info.update({
+ 'id': channel_id,
+ 'title': self._live_title(title),
+ 'is_live': True,
+ 'formats': formats,
+ })
+
+ return info
'id': '6x4q2w',
'ext': 'mp4',
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
- 'thumbnail': 're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
+ 'thumbnail': r're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
'timestamp': 1422487800,
'upload_date': '20150128',
'ext': 'mp4',
'title': 'Astérix - Le Domaine des Dieux Teaser VF',
'description': 'md5:4a754271d9c6f16c72629a8a993ee884',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19540403&cfilm=222257.html',
'ext': 'mp4',
'title': 'Planes 2 Bande-annonce VF',
'description': 'Regardez la bande annonce du film Planes 2 (Planes 2 Bande-annonce VF). Planes 2, un film de Roberts Gannaway',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19544709&cfilm=181290.html',
'ext': 'mp4',
'title': 'Dragons 2 - Bande annonce finale VF',
'description': 'md5:6cdd2d7c2687d4c6aafe80a35e17267a',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/video-19550147/',
'ext': 'mp4',
'title': 'Faux Raccord N°123 - Les gaffes de Cliffhanger',
'description': 'md5:bc734b83ffa2d8a12188d9eb48bb6354',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}]
'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
'ext': 'mp4',
'title': 'Sensual striptease porn with Samantha Alexandra',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1418694611,
'upload_date': '20141216',
'duration': 387,
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
- _VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
+ _VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{
# video with 5min ID
}
}, {
# video with vidible ID
- 'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
+ 'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
'uploader': video_data.get('videoOwner'),
'formats': formats,
}
-
-
-class AolFeaturesIE(InfoExtractor):
- IE_NAME = 'features.aol.com'
- _VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
-
- _TESTS = [{
- 'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
- 'md5': '7db483bb0c09c85e241f84a34238cc75',
- 'info_dict': {
- 'id': '519507715',
- 'ext': 'mp4',
- 'title': 'What To Watch - February 17, 2016',
- },
- 'add_ie': ['FiveMin'],
- 'params': {
- # encrypted m3u8 download
- 'skip_download': True,
- },
- }]
-
- def _real_extract(self, url):
- display_id = self._match_id(url)
- webpage = self._download_webpage(url, display_id)
- return self.url_result(self._search_regex(
- r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
- webpage, '5min embed url'), 'FiveMin')
'duration': 2600,
'title': 'Die Story im Ersten: Mission unter falscher Flagge',
'upload_date': '20140804',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'HTTP Error 404: Not Found',
}
import re
from .common import InfoExtractor
+from ..compat import compat_urlparse
from ..utils import (
determine_ext,
+ ExtractorError,
float_or_none,
int_or_none,
mimetype2ext,
class ArkenaIE(InfoExtractor):
- _VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:
+ video\.arkena\.com/play2/embed/player\?|
+ play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
+ )
+ '''
_TESTS = [{
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
}, {
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
'only_matching': True,
+ }, {
+ 'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
+ 'only_matching': True,
}]
@staticmethod
video_id = mobj.group('id')
account_id = mobj.group('account_id')
+ # Handle http://video.arkena.com/play2/embed/player URL
+ if not video_id:
+ qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+ video_id = qs.get('mediaId', [None])[0]
+ account_id = qs.get('accountId', [None])[0]
+ if not video_id or not account_id:
+ raise ExtractorError('Invalid URL', expected=True)
+
playlist = self._download_json(
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
% (video_id, account_id),
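The new embed-URL branch above cannot rely on the named regex groups, so it falls back to the query string. A stand-alone sketch of that extraction (helper name assumed):

```python
try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:  # Python 2
    from urlparse import urlparse, parse_qs

def arkena_ids_from_embed(url):
    # mediaId / accountId arrive as query parameters on the
    # play2/embed/player form of the URL.
    qs = parse_qs(urlparse(url).query)
    return qs.get('mediaId', [None])[0], qs.get('accountId', [None])[0]
```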
'title': 'Especial Solidario de Nochebuena',
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 5527.6,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'This video is only available for registered users'
},
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'ext': 'flv',
'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140127',
},
'params': {
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72,
'uploader': 'Steve Czaban',
- 'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
+ 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
}
}, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
--- /dev/null
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from .kaltura import KalturaIE
+from ..utils import (
+ get_element_by_id,
+ strip_or_none,
+ urljoin,
+)
+
+
+class AZMedienBaseIE(InfoExtractor):
+ def _kaltura_video(self, partner_id, entry_id):
+ return self.url_result(
+ 'kaltura:%s:%s' % (partner_id, entry_id), ie=KalturaIE.ie_key(),
+ video_id=entry_id)
+
+
+class AZMedienIE(AZMedienBaseIE):
+ IE_DESC = 'AZ Medien videos'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:www\.)?
+ (?:
+ telezueri\.ch|
+ telebaern\.tv|
+ telem1\.ch
+ )/
+ [0-9]+-show-[^/\#]+
+ (?:
+ /[0-9]+-episode-[^/\#]+
+ (?:
+ /[0-9]+-segment-(?:[^/\#]+\#)?|
+ \#
+ )|
+ \#
+ )
+ (?P<id>[^\#]+)
+ '''
+
+ _TESTS = [{
+ # URL with 'segment'
+ 'url': 'http://www.telezueri.ch/62-show-zuerinews/13772-episode-sonntag-18-dezember-2016/32419-segment-massenabweisungen-beim-hiltl-club-wegen-pelzboom',
+ 'info_dict': {
+ 'id': '1_2444peh4',
+ 'ext': 'mov',
+ 'title': 'Massenabweisungen beim Hiltl Club wegen Pelzboom',
+ 'description': 'md5:9ea9dd1b159ad65b36ddcf7f0d7c76a8',
+ 'uploader_id': 'TeleZ?ri',
+ 'upload_date': '20161218',
+ 'timestamp': 1482084490,
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ # URL with 'segment' and fragment:
+ 'url': 'http://www.telebaern.tv/118-show-news/14240-episode-dienstag-17-januar-2017/33666-segment-achtung-gefahr#zu-wenig-pflegerinnen-und-pfleger',
+ 'only_matching': True
+ }, {
+ # URL with 'episode' and fragment:
+ 'url': 'http://www.telem1.ch/47-show-sonntalk/13986-episode-soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz#soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz',
+ 'only_matching': True
+ }, {
+ # URL with 'show' and fragment:
+ 'url': 'http://www.telezueri.ch/66-show-sonntalk#burka-plakate-trump-putin-china-besuch',
+ 'only_matching': True
+ }]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ webpage = self._download_webpage(url, video_id)
+
+ partner_id = self._search_regex(
+ r'<script[^>]+src=["\'](?:https?:)?//(?:[^/]+\.)?kaltura\.com(?:/[^/]+)*/(?:p|partner_id)/([0-9]+)',
+ webpage, 'kaltura partner id')
+ entry_id = self._html_search_regex(
+ r'<a[^>]+data-id=(["\'])(?P<id>(?:(?!\1).)+)\1[^>]+data-slug=["\']%s'
+ % re.escape(video_id), webpage, 'kaltura entry id', group='id')
+
+ return self._kaltura_video(partner_id, entry_id)
+
+
+class AZMedienPlaylistIE(AZMedienBaseIE):
+ IE_DESC = 'AZ Medien playlists'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:www\.)?
+ (?:
+ telezueri\.ch|
+ telebaern\.tv|
+ telem1\.ch
+ )/
+ (?P<id>[0-9]+-
+ (?:
+ show|
+ topic|
+ themen
+ )-[^/\#]+
+ (?:
+ /[0-9]+-episode-[^/\#]+
+ )?
+ )$
+ '''
+
+ _TESTS = [{
+ # URL with 'episode'
+ 'url': 'http://www.telebaern.tv/118-show-news/13735-episode-donnerstag-15-dezember-2016',
+ 'info_dict': {
+ 'id': '118-show-news/13735-episode-donnerstag-15-dezember-2016',
+ 'title': 'News - Donnerstag, 15. Dezember 2016',
+ },
+ 'playlist_count': 9,
+ }, {
+ # URL with 'themen'
+ 'url': 'http://www.telem1.ch/258-themen-tele-m1-classics',
+ 'info_dict': {
+ 'id': '258-themen-tele-m1-classics',
+ 'title': 'Tele M1 Classics',
+ },
+ 'playlist_mincount': 15,
+ }, {
+ # URL with 'topic', contains nested playlists
+ 'url': 'http://www.telezueri.ch/219-topic-aera-trump-hat-offiziell-begonnen',
+ 'only_matching': True,
+ }, {
+ # URL with 'show' only
+ 'url': 'http://www.telezueri.ch/86-show-talktaeglich',
+ 'only_matching': True
+ }]
+
+ def _real_extract(self, url):
+ show_id = self._match_id(url)
+ webpage = self._download_webpage(url, show_id)
+
+ entries = []
+
+ partner_id = self._search_regex(
+ r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
+ webpage, 'kaltura partner id', default=None)
+
+ if partner_id:
+ entries = [
+ self._kaltura_video(partner_id, m.group('id'))
+ for m in re.finditer(
+ r'data-id=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage)]
+
+ if not entries:
+ entries = [
+ self.url_result(m.group('url'), ie=AZMedienIE.ie_key())
+ for m in re.finditer(
+ r'<a[^>]+data-real=(["\'])(?P<url>http.+?)\1', webpage)]
+
+ if not entries:
+ entries = [
+ # May contain nested playlists (e.g. [1]) thus no explicit
+ # ie_key
+ # 1. http://www.telezueri.ch/219-topic-aera-trump-hat-offiziell-begonnen)
+ self.url_result(urljoin(url, m.group('url')))
+ for m in re.finditer(
+ r'<a[^>]+name=[^>]+href=(["\'])(?P<url>/.+?)\1', webpage)]
+
+ title = self._search_regex(
+ r'episodeShareTitle\s*=\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
+ webpage, 'title',
+ default=strip_or_none(get_element_by_id(
+ 'video-title', webpage)), group='title')
+
+ return self.playlist_result(entries, show_id, title)
'ext': 'mp4',
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
- 'thumbnail': 're:^https?://.*\.jpe?g',
+ 'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1417523507.334,
'upload_date': '20141202',
'duration': 9988.7,
'ext': 'mp4',
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
- 'thumbnail': 're:^https?://.*\.jpe?g',
+ 'thumbnail': r're:^https?://.*\.jpe?g',
'timestamp': 1410530893.320,
'upload_date': '20140912',
'duration': 172.385,
'id': 'entropy-ep',
},
'playlist_mincount': 3,
+ }, {
+ # not all tracks have songs
+ 'url': 'https://insulters.bandcamp.com/album/we-are-the-plague',
+ 'info_dict': {
+ 'id': 'we-are-the-plague',
+ 'title': 'WE ARE THE PLAGUE',
+ 'uploader_id': 'insulters',
+ },
+ 'playlist_count': 2,
}]
def _real_extract(self, url):
album_id = mobj.group('album_id')
playlist_id = album_id or uploader_id
webpage = self._download_webpage(url, playlist_id)
- tracks_paths = re.findall(r'<a href="(.*?)" itemprop="url">', webpage)
- if not tracks_paths:
+ track_elements = re.findall(
+ r'(?s)<div[^>]*>(.*?<a[^>]+href="([^"]+?)"[^>]+itemprop="url"[^>]*>.*?)</div>', webpage)
+ if not track_elements:
raise ExtractorError('The page doesn\'t contain any tracks')
+ # Only tracks with duration info correspond to playable songs
entries = [
self.url_result(compat_urlparse.urljoin(url, t_path), ie=BandcampIE.ie_key())
- for t_path in tracks_paths]
+ for elem_content, t_path in track_elements
+ if self._html_search_meta('duration', elem_content, default=None)]
+
title = self._html_search_regex(
r'album_title\s*:\s*"((?:\\.|[^"\\])+?)"',
webpage, 'title', fatal=False)
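The Bandcamp album fix keeps each track link's surrounding markup so that tracks without duration info (which have no playable song) can be filtered out. A self-contained sketch of that filter, using simplified regexes and a made-up page snippet in place of `_html_search_meta`:

```python
import re

def playable_track_paths(webpage):
    # Capture each track's enclosing <div> together with its href,
    # then keep only entries whose markup carries a duration meta tag.
    track_elements = re.findall(
        r'(?s)<div[^>]*>(.*?<a[^>]+href="([^"]+?)"[^>]+itemprop="url"[^>]*>.*?)</div>',
        webpage)
    return [
        t_path for elem_content, t_path in track_elements
        if re.search(r'itemprop=(["\'])duration\1', elem_content)]
```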
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+ ExtractorError,
+ clean_html,
+ compat_str,
+ int_or_none,
+ parse_iso8601,
+ try_get,
+)
+
+
+class BeamProLiveIE(InfoExtractor):
+ IE_NAME = 'Beam:live'
+ _VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
+ _RATINGS = {'family': 0, 'teen': 13, '18+': 18}
+ _TEST = {
+ 'url': 'http://www.beam.pro/niterhayven',
+ 'info_dict': {
+ 'id': '261562',
+ 'ext': 'mp4',
+ 'title': 'Introducing The Witcher 3 // The Grind Starts Now!',
+ 'description': 'md5:0b161ac080f15fe05d18a07adb44a74d',
+ 'thumbnail': r're:https://.*\.jpg$',
+ 'timestamp': 1483477281,
+ 'upload_date': '20170103',
+ 'uploader': 'niterhayven',
+ 'uploader_id': '373396',
+ 'age_limit': 18,
+ 'is_live': True,
+ 'view_count': int,
+ },
+ 'skip': 'niterhayven is offline',
+ 'params': {
+ 'skip_download': True,
+ },
+ }
+
+ def _real_extract(self, url):
+ channel_name = self._match_id(url)
+
+ chan = self._download_json(
+ 'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name)
+
+ if chan.get('online') is False:
+ raise ExtractorError(
+ '{0} is offline'.format(channel_name), expected=True)
+
+ channel_id = chan['id']
+
+ formats = self._extract_m3u8_formats(
+ 'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
+ channel_name, ext='mp4', m3u8_id='hls', fatal=False)
+ self._sort_formats(formats)
+
+ user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
+
+ return {
+ 'id': compat_str(chan.get('id') or channel_name),
+ 'title': self._live_title(chan.get('name') or channel_name),
+ 'description': clean_html(chan.get('description')),
+ 'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str),
+ 'timestamp': parse_iso8601(chan.get('updatedAt')),
+ 'uploader': chan.get('token') or try_get(
+ chan, lambda x: x['user']['username'], compat_str),
+ 'uploader_id': compat_str(user_id) if user_id else None,
+ 'age_limit': self._RATINGS.get(chan.get('audience')),
+ 'is_live': True,
+ 'view_count': int_or_none(chan.get('viewersTotal')),
+ 'formats': formats,
+ }
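The Beam extractor leans on `try_get` to walk optional nested fields such as `chan['user']['id']`. A simplified stand-in for that helper (the real utility also accepts a list of getters):

```python
def try_get(src, getter, expected_type=None):
    # Apply the getter, swallow lookup failures, and optionally
    # enforce the result type.
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(v, expected_type):
        return v
    return None
```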
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'upload_date': '20141208',
- 'thumbnail': 're:(?i)^https?://.*\.jpg$',
+ 'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
'description': 'A BET News special.',
'duration': 1696,
'upload_date': '20141125',
- 'thumbnail': 're:(?i)^https?://.*\.jpg$',
+ 'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
'ext': 'mp4',
'title': 'Das können die neuen iPads',
'description': 'md5:a4058c4fa2a804ab59c00d7244bbf62f',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 196,
}
}
import re
from .common import InfoExtractor
-from ..compat import compat_parse_qs
+from ..compat import (
+ compat_parse_qs,
+ compat_urlparse,
+)
from ..utils import (
+ ExtractorError,
int_or_none,
float_or_none,
+ parse_iso8601,
+ smuggle_url,
+ strip_jsonp,
unified_timestamp,
+ unsmuggle_url,
urlencode_postdata,
)
class BiliBiliIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/(?P<anime_id>\d+)/play#)(?P<id>\d+)'
- _TEST = {
+ _TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
'info_dict': {
'duration': 308.315,
'timestamp': 1398012660,
'upload_date': '20140420',
- 'thumbnail': 're:^https?://.+\.jpg',
+ 'thumbnail': r're:^https?://.+\.jpg',
'uploader': '菊子桑',
'uploader_id': '156160',
},
- }
+ }, {
+ # Tested in BiliBiliBangumiIE
+ 'url': 'http://bangumi.bilibili.com/anime/1869/play#40062',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://bangumi.bilibili.com/anime/5802/play#100643',
+ 'md5': '3f721ad1e75030cc06faf73587cfec57',
+ 'info_dict': {
+ 'id': '100643',
+ 'ext': 'mp4',
+ 'title': 'CHAOS;CHILD',
+ 'description': '如果你是神明,并且能够让妄想成为现实。那你会进行怎么样的妄想?是淫靡的世界?独裁社会?毁灭性的制裁?还是……2015年,涩谷。从6年前发生的大灾害“涩谷地震”之后复兴了的这个街区里新设立的私立高中...',
+ },
+ 'skip': 'Geo-restricted to China',
+ }]
+
+ _APP_KEY = '84956560bc028eb7'
+ _BILIBILI_KEY = '94aba54af9065f71de72f5508f1cd42e'
- _APP_KEY = '6f90a59ac58a4123'
- _BILIBILI_KEY = '0bfd84cc3940035173f35e6777508326'
+ def _report_error(self, result):
+ if 'message' in result:
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, result['message']), expected=True)
+ elif 'code' in result:
+ raise ExtractorError('%s returns error %d' % (self.IE_NAME, result['code']), expected=True)
+ else:
+ raise ExtractorError('Can\'t extract Bangumi episode ID')
def _real_extract(self, url):
- video_id = self._match_id(url)
+ url, smuggled_data = unsmuggle_url(url, {})
+
+ mobj = re.match(self._VALID_URL, url)
+ video_id = mobj.group('id')
+ anime_id = mobj.group('anime_id')
webpage = self._download_webpage(url, video_id)
- if 'anime/v' not in url:
+ if 'anime/' not in url:
cid = compat_parse_qs(self._search_regex(
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
webpage, 'player parameters'))['cid'][0]
else:
+ if 'no_bangumi_tip' not in smuggled_data:
+ self.to_screen('Downloading episode %s. To download all videos in anime %s, re-run youtube-dl with %s' % (
+ video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
+ headers = {
+ 'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
+ }
+ headers.update(self.geo_verification_headers())
+
js = self._download_json(
'http://bangumi.bilibili.com/web_api/get_source', video_id,
data=urlencode_postdata({'episode_id': video_id}),
- headers={'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'})
+ headers=headers)
+ if 'result' not in js:
+ self._report_error(js)
cid = js['result']['cid']
payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
video_info = self._download_json(
'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
- video_id, note='Downloading video info page')
+ video_id, note='Downloading video info page',
+ headers=self.geo_verification_headers())
+
+ if 'durl' not in video_info:
+ self._report_error(video_info)
entries = []
title = self._html_search_regex('<h1[^>]+title="([^"]+)">', webpage, 'title')
description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex(
- r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', fatal=False))
+ r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)
# TODO 'view_count' requires deobfuscating Javascript
}
uploader_mobj = re.search(
- r'<a[^>]+href="https?://space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
+ r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
webpage)
if uploader_mobj:
info.update({
'description': description,
'entries': entries,
}
+
+
+class BiliBiliBangumiIE(InfoExtractor):
+ _VALID_URL = r'https?://bangumi\.bilibili\.com/anime/(?P<id>\d+)'
+
+ IE_NAME = 'bangumi.bilibili.com'
+ IE_DESC = 'BiliBili番剧'
+
+ _TESTS = [{
+ 'url': 'http://bangumi.bilibili.com/anime/1869',
+ 'info_dict': {
+ 'id': '1869',
+ 'title': '混沌武士',
+ 'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
+ },
+ 'playlist_count': 26,
+ }, {
+ 'url': 'http://bangumi.bilibili.com/anime/1869',
+ 'info_dict': {
+ 'id': '1869',
+ 'title': '混沌武士',
+ 'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
+ },
+ 'playlist': [{
+ 'md5': '91da8621454dd58316851c27c68b0c13',
+ 'info_dict': {
+ 'id': '40062',
+ 'ext': 'mp4',
+ 'title': '混沌武士',
+ 'description': '故事发生在日本的江户时代。风是一个小酒馆的打工女。一日,酒馆里来了一群恶霸,虽然他们的举动令风十分不满,但是毕竟风只是一届女流,无法对他们采取什么行动,只能在心里嘟哝。这时,酒家里又进来了个“不良份子...',
+ 'timestamp': 1414538739,
+ 'upload_date': '20141028',
+ 'episode': '疾风怒涛 Tempestuous Temperaments',
+ 'episode_number': 1,
+ },
+ }],
+ 'params': {
+ 'playlist_items': '1',
+ },
+ }]
+
+ @classmethod
+ def suitable(cls, url):
+ return False if BiliBiliIE.suitable(url) else super(BiliBiliBangumiIE, cls).suitable(url)
+
+ def _real_extract(self, url):
+ bangumi_id = self._match_id(url)
+
+ # Sometimes this API returns a JSONP response
+ season_info = self._download_json(
+ 'http://bangumi.bilibili.com/jsonp/seasoninfo/%s.ver' % bangumi_id,
+ bangumi_id, transform_source=strip_jsonp)['result']
+
+ entries = [{
+ '_type': 'url_transparent',
+ 'url': smuggle_url(episode['webplay_url'], {'no_bangumi_tip': 1}),
+ 'ie_key': BiliBiliIE.ie_key(),
+ 'timestamp': parse_iso8601(episode.get('update_time'), delimiter=' '),
+ 'episode': episode.get('index_title'),
+ 'episode_number': int_or_none(episode.get('index')),
+ } for episode in season_info['episodes']]
+
+ entries = sorted(entries, key=lambda entry: entry.get('episode_number') or 0)
+
+ return self.playlist_result(
+ entries, bangumi_id,
+ season_info.get('bangumi_title'), season_info.get('evaluate'))
'id': 'sobre-camaras-y-camarillas-parlamentarias',
'ext': 'mp4',
'title': 'Sobre Cámaras y camarillas parlamentarias',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Fernando Atria',
},
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
'ext': 'mp4',
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Piangella Obrador',
},
'params': {
name = self._match_id(url)
webpage = self._download_webpage(url, name)
video_id = self._search_regex(
- r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
+ (r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
+ r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
webpage, 'id', group='url', default=None)
if not video_id:
bplayer_data = self._parse_json(self._search_regex(
from __future__ import unicode_literals
import re
-import json
from .common import InfoExtractor
+from ..compat import compat_str
from ..utils import (
int_or_none,
parse_age_limit,
class BreakIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.)?(?P<site>break|screenjunkies)\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {
'title': 'When Girls Act Like D-Bags',
'age_limit': 13,
}
+ }, {
+ 'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
+ 'md5': '5c2b686bec3d43de42bde9ec047536b0',
+ 'info_dict': {
+ 'id': '2841915',
+ 'display_id': 'best-quentin-tarantino-movie',
+ 'ext': 'mp4',
+ 'title': 'Best Quentin Tarantino Movie',
+ 'thumbnail': r're:^https?://.*\.jpg',
+ 'duration': 3671,
+ 'age_limit': 13,
+ 'tags': list,
+ },
+ }, {
+ 'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
+ 'info_dict': {
+ 'id': '2348808',
+ 'display_id': 'honest-trailers-the-dark-knight',
+ 'ext': 'mp4',
+ 'title': 'Honest Trailers - The Dark Knight',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
+ 'age_limit': 10,
+ 'tags': list,
+ },
+ }, {
+ # requires subscription but worked around
+ 'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
+ 'info_dict': {
+ 'id': '3003285',
+ 'display_id': 'knocking-dead-ep-1-the-show-so-far',
+ 'ext': 'mp4',
+ 'title': 'State of The Dead Recap: Knocking Dead Pilot',
+ 'thumbnail': r're:^https?://.*\.jpg',
+ 'duration': 3307,
+ 'age_limit': 13,
+ 'tags': list,
+ },
}, {
'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
'only_matching': True,
}]
+ _DEFAULT_BITRATES = (48, 150, 320, 496, 864, 2240, 3264)
+
def _real_extract(self, url):
- video_id = self._match_id(url)
+ site, display_id, video_id = re.match(self._VALID_URL, url).groups()
+
+ if not video_id:
+ webpage = self._download_webpage(url, display_id)
+ video_id = self._search_regex(
+ (r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
+ webpage, 'video id')
+
webpage = self._download_webpage(
- 'http://www.break.com/embed/%s' % video_id, video_id)
- info = json.loads(self._search_regex(
- r'var embedVars = ({.*})\s*?</script>',
- webpage, 'info json', flags=re.DOTALL))
+ 'http://www.%s.com/embed/%s' % (site, video_id),
+ display_id, 'Downloading video embed page')
+ embed_vars = self._parse_json(
+ self._search_regex(
+ r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
+ display_id)
- youtube_id = info.get('youtubeId')
+ youtube_id = embed_vars.get('youtubeId')
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
- formats = [{
- 'url': media['uri'] + '?' + info['AuthToken'],
- 'tbr': media['bitRate'],
- 'width': media['width'],
- 'height': media['height'],
- } for media in info['media'] if media.get('mediaPurpose') == 'play']
+ title = embed_vars['contentName']
- if not formats:
+ formats = []
+ bitrates = []
+ for f in embed_vars.get('media', []):
+ if not f.get('uri') or f.get('mediaPurpose') != 'play':
+ continue
+ bitrate = int_or_none(f.get('bitRate'))
+ if bitrate:
+ bitrates.append(bitrate)
formats.append({
- 'url': info['videoUri']
+ 'url': f['uri'],
+ 'format_id': 'http-%d' % bitrate if bitrate else 'http',
+ 'width': int_or_none(f.get('width')),
+ 'height': int_or_none(f.get('height')),
+ 'tbr': bitrate,
+ 'format': 'mp4',
})
- self._sort_formats(formats)
+ if not bitrates:
+ # When subscriptionLevel > 0, i.e. plus subscription is required
+ # media list will be empty. However, hds and hls uris are still
+ # available. We can grab them assuming bitrates to be default.
+ bitrates = self._DEFAULT_BITRATES
+
+ auth_token = embed_vars.get('AuthToken')
- duration = int_or_none(info.get('videoLengthInSeconds'))
- age_limit = parse_age_limit(info.get('audienceRating'))
+ def construct_manifest_url(base_url, ext):
+ pieces = [base_url]
+ pieces.extend([compat_str(b) for b in bitrates])
+ pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
+ return ','.join(pieces)
+
+ if bitrates and auth_token:
+ hds_url = embed_vars.get('hdsUri')
+ if hds_url:
+ formats.extend(self._extract_f4m_formats(
+ construct_manifest_url(hds_url, 'f4m'),
+ display_id, f4m_id='hds', fatal=False))
+ hls_url = embed_vars.get('hlsUri')
+ if hls_url:
+ formats.extend(self._extract_m3u8_formats(
+ construct_manifest_url(hls_url, 'm3u8'),
+ display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
+ self._sort_formats(formats)
return {
'id': video_id,
- 'title': info['contentName'],
- 'thumbnail': info['thumbUri'],
- 'duration': duration,
- 'age_limit': age_limit,
+ 'display_id': display_id,
+ 'title': title,
+ 'thumbnail': embed_vars.get('thumbUri'),
+ 'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
+ 'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
+ 'tags': embed_vars.get('tags', '').split(','),
'formats': formats,
}
params = {}
- playerID = find_param('playerID')
+ playerID = find_param('playerID') or find_param('playerId')
if playerID is None:
raise ExtractorError('Cannot find player ID')
params['playerID'] = playerID
# // build Brightcove <object /> XML
# }
m = re.search(
- r'''(?x)customBC.\createVideo\(
+ r'''(?x)customBC\.createVideo\(
.*? # skipping width and height
["\'](?P<playerID>\d+)["\']\s*,\s* # playerID
["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s* # playerKey begins with AQ and is 50 characters
"""Return a list of all Brightcove URLs from the webpage """
url_m = re.search(
- r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
- webpage)
+ r'''(?x)
+ <meta\s+
+ (?:property|itemprop)=([\'"])(?:og:video|embedURL)\1[^>]+
+ content=([\'"])(?P<url>https?://(?:secure|c)\.brightcove\.com/(?:(?!\2).)+)\2
+ ''', webpage)
if url_m:
- url = unescapeHTML(url_m.group(1))
+ url = unescapeHTML(url_m.group('url'))
# Some sites don't add it, we can't download with this url, for example:
# http://www.ktvu.com/videos/news/raw-video-caltrain-releases-video-of-man-almost/vCTZdY/
- if 'playerKey' in url or 'videoId' in url:
+ if 'playerKey' in url or 'videoId' in url or 'idVideo' in url:
return [url]
matches = re.findall(
url, smuggled_data = unsmuggle_url(url, {})
# Change the 'videoId' and others field to '@videoPlayer'
- url = re.sub(r'(?<=[?&])(videoI(d|D)|bctid)', '%40videoPlayer', url)
+ url = re.sub(r'(?<=[?&])(videoI(d|D)|idVideo|bctid)', '%40videoPlayer', url)
# Change bckey (used by bcove.me urls) to playerKey
url = re.sub(r'(?<=[?&])bckey', 'playerKey', url)
mobj = re.match(self._VALID_URL, url)
container = source.get('container')
ext = mimetype2ext(source.get('type'))
src = source.get('src')
- if ext == 'ism':
+ if ext == 'ism' or container == 'WVM':
continue
elif ext == 'm3u8' or container == 'M2TS':
if not src:
'ext': 'mp4',
'title': 'Season 5 Episode 5',
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1486.486,
},
'params': {
'id': '5181',
'ext': 'mp4',
'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'creator': 'ss11spring',
'duration': 1591,
'upload_date': '20130114',
'id': '13885',
'ext': 'mp4',
'title': 'EverCam + Camdemy QuickStart',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2a9f989c2b153a2342acee579c6e7db6',
'creator': 'evercam',
'duration': 318,
(?:www\.)?d8\.tv|
(?:www\.)?c8\.fr|
(?:www\.)?d17\.tv|
+ (?:(?:football|www)\.)?cstar\.fr|
(?:www\.)?itele\.fr
)/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
player\.canalplus\.fr/#/(?P<id>\d+)
'd8': 'd8',
'c8': 'd8',
'd17': 'd17',
+ 'cstar': 'd17',
'itele': 'itele',
}
'description': 'Chaque matin du lundi au vendredi, Michaël Darmon reçoit un invité politique à 8h25.',
'upload_date': '20161014',
},
+ }, {
+ 'url': 'http://football.cstar.fr/cstar-minisite-foot/pid7566-feminines-videos.html?vid=1416769',
+ 'info_dict': {
+ 'id': '1416769',
+ 'display_id': 'pid7566-feminines-videos',
+ 'ext': 'mp4',
+ 'title': 'France - Albanie : les temps forts de la soirée - 20/09/2016',
+ 'description': 'md5:c3f30f2aaac294c1c969b3294de6904e',
+ 'upload_date': '20160921',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
}, {
'url': 'http://m.canalplus.fr/?vid=1398231',
'only_matching': True,
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
[r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
- r'id=["\']canal_video_player(?P<id>\d+)'],
- webpage, 'video id', group='id')
+ r'id=["\']canal_video_player(?P<id>\d+)',
+ r'data-video=["\'](?P<id>\d+)'],
+ webpage, 'video id', default=mobj.group('vid'), group='id')
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
'ext': 'mp4',
'title': 'De afspraak veilt voor de Warmste Week',
'description': 'md5:24cb860c320dc2be7358e0e5aa317ba6',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 49.02,
}
}, {
'ext': 'mp4',
'title': 'Pieter 0167',
'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2553.08,
'subtitles': {
'nl': [{
'ext': 'mp4',
'title': 'Herbekijk Sorry voor alles',
'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 3788.06,
},
'params': {
elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url, display_id, f4m_id=format_type, fatal=False))
+ elif format_type == 'MPEG_DASH':
+ formats.extend(self._extract_mpd_formats(
+ format_url, display_id, mpd_id=format_type, fatal=False))
else:
formats.append({
'format_id': format_type,
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
'id': '475222',
'ext': 'flv',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
# duration reported by videomore is incorrect
'duration': int,
},
},
}],
'skip': 'Geo-restricted to Canada',
+ }, {
+ # multiple CBC.APP.Caffeine.initInstance(...)
+ 'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
+ 'info_dict': {
+ 'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
+ 'id': 'dog-indoor-exercise-winter-1.3928238',
+ },
+ 'playlist_mincount': 6,
}]
@classmethod
def suitable(cls, url):
return False if CBCPlayerIE.suitable(url) else super(CBCIE, cls).suitable(url)
+ def _extract_player_init(self, player_init, display_id):
+ player_info = self._parse_json(player_init, display_id, js_to_json)
+ media_id = player_info.get('mediaId')
+ if not media_id:
+ clip_id = player_info['clipId']
+ feed = self._download_json(
+ 'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
+ clip_id, fatal=False)
+ if feed:
+ media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
+ if not media_id:
+ media_id = self._download_json(
+ 'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
+ clip_id)['entries'][0]['id'].split('/')[-1]
+ return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
+
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
- player_init = self._search_regex(
- r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage, 'player init',
- default=None)
- if player_init:
- player_info = self._parse_json(player_init, display_id, js_to_json)
- media_id = player_info.get('mediaId')
- if not media_id:
- clip_id = player_info['clipId']
- feed = self._download_json(
- 'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
- clip_id, fatal=False)
- if feed:
- media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
- if not media_id:
- media_id = self._download_json(
- 'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
- clip_id)['entries'][0]['id'].split('/')[-1]
- return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
- else:
- entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
- return self.playlist_result(entries)
+ entries = [
+ self._extract_player_init(player_init, display_id)
+ for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
+ entries.extend([
+ self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
+ for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)])
+ return self.playlist_result(
+ entries, display_id,
+ self._og_search_title(webpage, fatal=False),
+ self._og_search_description(webpage))
class CBCPlayerIE(InfoExtractor):
formats = self._extract_m3u8_formats(re.sub(r'/([^/]+)/[^/?]+\.m3u8', r'/\1/\1.m3u8', m3u8_url), video_id, 'mp4', fatal=False)
if len(formats) < 2:
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
- # Despite metadata in m3u8 all video+audio formats are
- # actually video-only (no audio)
for f in formats:
- if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
- f['acodec'] = 'none'
+ format_id = f.get('format_id') or ''
+ if format_id.startswith('AAC'):
+ f['acodec'] = 'aac'
+ elif format_id.startswith('AC3'):
+ f['acodec'] = 'ac-3'
self._sort_formats(formats)
info = {
'upload_date': '20140404',
'timestamp': 1396650660,
'uploader': 'CBSI-NEW',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 205,
'subtitles': {
'en': [{
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
- event_id = self._search_regex("data-id='(\d+)'", webpage, 'event id')
+ event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
event_data = self._download_json('https://media.ccc.de/public/events/%s' % event_id, event_id)
formats = []
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ parse_duration,
+ parse_iso8601,
+ clean_html,
+)
+
+
+class CCMAIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?ccma\.cat/(?:[^/]+/)*?(?P<type>video|audio)/(?P<id>\d+)'
+ _TESTS = [{
+ 'url': 'http://www.ccma.cat/tv3/alacarta/lespot-de-la-marato-de-tv3/lespot-de-la-marato-de-tv3/video/5630208/',
+ 'md5': '7296ca43977c8ea4469e719c609b0871',
+ 'info_dict': {
+ 'id': '5630208',
+ 'ext': 'mp4',
+ 'title': 'L\'espot de La Marató de TV3',
+ 'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
+ 'timestamp': 1470918540,
+ 'upload_date': '20160811',
+ }
+ }, {
+ 'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
+ 'md5': 'fa3e38f269329a278271276330261425',
+ 'info_dict': {
+ 'id': '943685',
+ 'ext': 'mp3',
+ 'title': 'El Consell de Savis analitza el derbi',
+ 'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
+ 'upload_date': '20171205',
+ 'timestamp': 1512507300,
+ }
+ }]
+
+ def _real_extract(self, url):
+ media_type, media_id = re.match(self._VALID_URL, url).groups()
+ media_data = {}
+ formats = []
+ profiles = ['pc'] if media_type == 'audio' else ['mobil', 'pc']
+ for i, profile in enumerate(profiles):
+ md = self._download_json('http://dinamics.ccma.cat/pvideo/media.jsp', media_id, query={
+ 'media': media_type,
+ 'idint': media_id,
+ 'profile': profile,
+ }, fatal=False)
+ if md:
+ media_data = md
+ media_url = media_data.get('media', {}).get('url')
+ if media_url:
+ formats.append({
+ 'format_id': profile,
+ 'url': media_url,
+ 'quality': i,
+ })
+ self._sort_formats(formats)
+
+ informacio = media_data['informacio']
+ title = informacio['titol']
+ durada = informacio.get('durada', {})
+ duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
+ timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
+
+ subtitles = {}
+ subtitols = media_data.get('subtitols', {})
+ if subtitols:
+ sub_url = subtitols.get('url')
+ if sub_url:
+ subtitles.setdefault(
+ subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
+ 'url': sub_url,
+ })
+
+ thumbnails = []
+ imatges = media_data.get('imatges', {})
+ if imatges:
+ thumbnail_url = imatges.get('url')
+ if thumbnail_url:
+ thumbnails = [{
+ 'url': thumbnail_url,
+ 'width': int_or_none(imatges.get('amplada')),
+ 'height': int_or_none(imatges.get('alcada')),
+ }]
+
+ return {
+ 'id': media_id,
+ 'title': title,
+ 'description': clean_html(informacio.get('descripcio')),
+ 'duration': duration,
+ 'timestamp': timestamp,
+ 'thumbnails': thumbnails,
+ 'subtitles': subtitles,
+ 'formats': formats,
+ }
import re
from .common import InfoExtractor
-from ..utils import float_or_none
+from ..compat import compat_str
+from ..utils import (
+ float_or_none,
+ try_get,
+ unified_timestamp,
+)
class CCTVIE(InfoExtractor):
- _VALID_URL = r'''(?x)https?://(?:.+?\.)?
- (?:
- cctv\.(?:com|cn)|
- cntv\.cn
- )/
- (?:
- video/[^/]+/(?P<id>[0-9a-f]{32})|
- \d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
- )'''
+ IE_DESC = '央视网'
+ _VALID_URL = r'https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)'
_TESTS = [{
- 'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
- 'md5': '819c7b49fc3927d529fb4cd555621823',
+ # fo.addVariable("videoCenterId","id")
+ 'url': 'http://sports.cntv.cn/2016/02/12/ARTIaBRxv4rTT1yWf1frW2wi160212.shtml',
+ 'md5': 'd61ec00a493e09da810bf406a078f691',
'info_dict': {
- 'id': '454368eb19ad44a1925bf1eb96140a61',
+ 'id': '5ecdbeab623f4973b40ff25f18b174e8',
'ext': 'mp4',
- 'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
- }
+ 'title': '[NBA]二少联手砍下46分 雷霆主场击败鹈鹕(快讯)',
+ 'description': 'md5:7e14a5328dc5eb3d1cd6afbbe0574e95',
+ 'duration': 98,
+ 'uploader': 'songjunjie',
+ 'timestamp': 1455279956,
+ 'upload_date': '20160212',
+ },
+ }, {
+ # var guid = "id"
+ 'url': 'http://tv.cctv.com/2016/02/05/VIDEUS7apq3lKrHG9Dncm03B160205.shtml',
+ 'info_dict': {
+ 'id': 'efc5d49e5b3b4ab2b34f3a502b73d3ae',
+ 'ext': 'mp4',
+ 'title': '[赛车]“车王”舒马赫恢复情况成谜(快讯)',
+ 'description': '2月4日,蒙特泽莫罗透露了关于“车王”舒马赫恢复情况,但情况是否属实遭到了质疑。',
+ 'duration': 37,
+ 'uploader': 'shujun',
+ 'timestamp': 1454677291,
+ 'upload_date': '20160205',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ # changePlayer('id')
+ 'url': 'http://english.cntv.cn/special/four_comprehensives/index.shtml',
+ 'info_dict': {
+ 'id': '4bb9bb4db7a6471ba85fdeda5af0381e',
+ 'ext': 'mp4',
+ 'title': 'NHnews008 ANNUAL POLITICAL SEASON',
+ 'description': 'Four Comprehensives',
+ 'duration': 60,
+ 'uploader': 'zhangyunlei',
+ 'timestamp': 1425385521,
+ 'upload_date': '20150303',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ # loadvideo('id')
+ 'url': 'http://cctv.cntv.cn/lm/tvseries_russian/yilugesanghua/index.shtml',
+ 'info_dict': {
+ 'id': 'b15f009ff45c43968b9af583fc2e04b2',
+ 'ext': 'mp4',
+ 'title': 'Путь,усыпанный космеями Серия 1',
+ 'description': 'Путь, усыпанный космеями',
+ 'duration': 2645,
+ 'uploader': 'renxue',
+ 'timestamp': 1477479241,
+ 'upload_date': '20161026',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ # var initMyAray = 'id'
+ 'url': 'http://www.ncpa-classic.com/2013/05/22/VIDE1369219508996867.shtml',
+ 'info_dict': {
+ 'id': 'a194cfa7f18c426b823d876668325946',
+ 'ext': 'mp4',
+ 'title': '小泽征尔音乐塾 音乐梦想无国界',
+ 'duration': 2173,
+ 'timestamp': 1369248264,
+ 'upload_date': '20130522',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ # var ids = ["id"]
+ 'url': 'http://www.ncpa-classic.com/clt/more/416/index.shtml',
+ 'info_dict': {
+ 'id': 'a8606119a4884588a79d81c02abecc16',
+ 'ext': 'mp3',
+ 'title': '来自维也纳的新年贺礼',
+ 'description': 'md5:f13764ae8dd484e84dd4b39d5bcba2a7',
+ 'duration': 1578,
+ 'uploader': 'djy',
+ 'timestamp': 1482942419,
+ 'upload_date': '20161228',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ 'expected_warnings': ['Failed to download m3u8 information'],
+ }, {
+ 'url': 'http://ent.cntv.cn/2016/01/18/ARTIjprSSJH8DryTVr5Bx8Wb160118.shtml',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://tv.cntv.cn/video/C39296/e0210d949f113ddfb38d31f00a4e5c44',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
+ 'only_matching': True,
}, {
'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
'only_matching': True,
}, {
'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
- 'only_matching': True
+ 'only_matching': True,
}]
def _real_extract(self, url):
- video_id, display_id = re.match(self._VALID_URL, url).groups()
- if not video_id:
- webpage = self._download_webpage(url, display_id)
- video_id = self._search_regex(
- r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
- webpage, 'video_id')
- api_data = self._download_json(
- 'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
- m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+
+ video_id = self._search_regex(
+ [r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
+ r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
+ r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
+ r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
+ r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
+ r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)'],
+ webpage, 'video id')
+
+ data = self._download_json(
+ 'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do', video_id,
+ query={
+ 'pid': video_id,
+ 'url': url,
+ 'idl': 32,
+ 'idlr': 32,
+ 'modifyed': 'false',
+ })
+
+ title = data['title']
+
+ formats = []
+
+ video = data.get('video')
+ if isinstance(video, dict):
+ for quality, chapters_key in enumerate(('lowChapters', 'chapters')):
+ video_url = try_get(
+ video, lambda x: x[chapters_key][0]['url'], compat_str)
+ if video_url:
+ formats.append({
+ 'url': video_url,
+ 'format_id': 'http',
+ 'quality': quality,
+ 'preference': -1,
+ })
+
+ hls_url = try_get(data, lambda x: x['hls_url'], compat_str)
+ if hls_url:
+ hls_url = re.sub(r'maxbr=\d+&?', '', hls_url)
+ formats.extend(self._extract_m3u8_formats(
+ hls_url, video_id, 'mp4', entry_protocol='m3u8_native',
+ m3u8_id='hls', fatal=False))
+
+ self._sort_formats(formats)
+
+ uploader = data.get('editer_name')
+ description = self._html_search_meta(
+ 'description', webpage, default=None)
+ timestamp = unified_timestamp(data.get('f_pgmtime'))
+ duration = float_or_none(try_get(video, lambda x: x['totalLength']))
return {
'id': video_id,
- 'title': api_data['title'],
- 'formats': self._extract_m3u8_formats(
- m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
- 'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
+ 'title': title,
+ 'description': description,
+ 'uploader': uploader,
+ 'timestamp': timestamp,
+ 'duration': duration,
+ 'formats': formats,
}
'height': 720,
'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
'description': 'md5:269ccd135d550da90d1662651fcb9772',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'average_rating': float,
'duration': 39
}
'ext': 'mp4',
'title': 'Lądowanie na lotnisku na Maderze',
'description': 'md5:60d76b71186dcce4e0ba6d4bbdb13e1a',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'crash404',
'view_count': int,
'average_rating': float,
'ext': 'mp4',
'title': 'Hyde Park Civilizace',
'description': 'md5:fe93f6eda372d150759d11644ebbfb4a',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 3350,
},
'params': {
'ext': 'mp4',
'title': 'Hyde Park Civilizace: Bonus 01 - En',
'description': 'English Subtittles',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 81.3,
},
'params': {
'info_dict': {
'id': 402,
'ext': 'mp4',
- 'title': 're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+ 'title': r're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
},
'params': {
'id': '61924494877068022',
'ext': 'mp4',
'title': 'Queer: Bogotart (Queer)',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 1558.3,
},
}],
'title': 'Developer Kick-Off Session: Stuff We Love',
'description': 'md5:c08d72240b7c87fcecafe2692f80e35f',
'duration': 4576,
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'session_code': 'KOS002',
'session_day': 'Day 1',
'session_room': 'Arena 1A',
'title': 'Self-service BI with Power BI - nuclear testing',
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
'duration': 1540,
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'authors': ['Mike Wilmot'],
},
}, {
'title': 'Ranges for the Standard Library',
'description': 'md5:2e6b4917677af3728c5f6d63784c4c5d',
'duration': 5646,
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
'params': {
'skip_download': True,
'id': '27996',
'ext': 'mp4',
'title': 'Remembering Zaha Hadid',
- 'thumbnail': 're:^https?://.*\.jpg\?\d+',
+ 'thumbnail': r're:^https?://.*\.jpg\?\d+',
'description': 'We revisit past conversations with Zaha Hadid, in memory of the world renowned Iraqi architect.',
'subtitles': {
'en': [{
from __future__ import unicode_literals
+import re
+
from .common import InfoExtractor
from ..utils import ExtractorError
webpage = self._download_webpage(url, video_id)
- m3u8_url = self._search_regex(
- r'src=(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage,
- 'playlist', default=None, group='url')
+ m3u8_formats = [(m.group('id').lower(), m.group('url')) for m in re.finditer(
+ r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)', webpage)]
- if not m3u8_url:
+ if not m3u8_formats:
error = self._search_regex(
[r'<span[^>]+class=(["\'])desc_span\1[^>]*>(?P<error>[^<]+)</span>',
r'<div[^>]+id=(["\'])defchat\1[^>]*>\s*<p><strong>(?P<error>[^<]+)<'],
webpage, 'error', group='error', default=None)
if not error:
- if any(p not in webpage for p in (
+ if any(p in webpage for p in (
self._ROOM_OFFLINE, 'offline_tipping', 'tip_offline')):
error = self._ROOM_OFFLINE
if error:
raise ExtractorError(error, expected=True)
raise ExtractorError('Unable to find stream URL')
- formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
+ formats = []
+ for m3u8_id, m3u8_url in m3u8_formats:
+ formats.extend(self._extract_m3u8_formats(
+ m3u8_url, video_id, ext='mp4',
+ # ffmpeg skips segments for fast m3u8
+ preference=-10 if m3u8_id == 'fast' else None,
+ m3u8_id=m3u8_id, fatal=False, live=True))
self._sort_formats(formats)
return {
'id': video_id,
'title': self._live_title(video_id),
- 'thumbnail': 'https://cdn-s.highwebmedia.com/uHK3McUtGCG3SMFcd4ZJsRv8/roomimage/%s.jpg' % video_id,
+ 'thumbnail': 'https://roomimg.stream.highwebmedia.com/ri/%s.jpg' % video_id,
'age_limit': self._rta_search(webpage),
'is_live': True,
'formats': formats,
'title': 'md5:f542ea253f5255240be4da375c6a5d7e',
'description': 'md5:f24a4e22a71763e32da5fed59e47c770',
'duration': 306,
+ 'uploader': 'Gerryaudio',
},
'params': {
'skip_download': True,
duration = parse_duration(self._search_regex(
r'class=["\']c-length["\'][^>]*>([^<]+)',
webpage, 'duration', fatal=False))
+ uploader = self._search_regex(
+ r'id=["\']chirbit-username["\'][^>]*>([^<]+)',
+ webpage, 'uploader', fatal=False)
return {
'id': audio_id,
'title': title,
'description': description,
'duration': duration,
+ 'uploader': uploader,
}
'id': '1012420',
'ext': 'flv',
'title': 'Fun Jynx Maze solo',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
'skip': 'Video gone',
'id': '2019449',
'ext': 'mp4',
'title': 'ShesNew - My booty girlfriend, Victoria Paradice\'s pussy filled with jizz',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
}]
'ext': 'mp4',
'title': 'Brick Briscoe',
'duration': 612,
- 'thumbnail': 're:^https?://.+\.jpg',
+ 'thumbnail': r're:^https?://.+\.jpg',
},
}, {
'url': 'http://chic.clipsyndicate.com/video/play/5844117/shark_attack',
'ext': 'mp4',
'title': 'Clubic Week 2.0 : le FBI se lance dans la photo d\u0092identité',
'description': 're:Gueule de bois chez Nokia. Le constructeur a indiqué cette.*',
- 'thumbnail': 're:^http://img\.clubic\.com/.*\.jpg$',
+ 'thumbnail': r're:^http://img\.clubic\.com/.*\.jpg$',
}
}, {
'url': 'http://www.clubic.com/video/video-clubic-week-2-0-apple-iphone-6s-et-plus-mais-surtout-le-pencil-469792.html',
from __future__ import unicode_literals
from .mtv import MTVIE
-from ..utils import ExtractorError
class CMTIE(MTVIE):
IE_NAME = 'cmt.com'
- _VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
- _FEED_URL = 'http://www.cmt.com/sitewide/apps/player/embed/rss/'
+ _VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|(?:full-)?episodes|video-clips)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cmt.com/videos/garth-brooks/989124/the-call-featuring-trisha-yearwood.jhtml#artist=30061',
}, {
'url': 'http://www.cmt.com/shows/party-down-south/party-down-south-ep-407-gone-girl/1738172/playlist/#id=1738172',
'only_matching': True,
+ }, {
+ 'url': 'http://www.cmt.com/full-episodes/537qb3/nashville-the-wayfaring-stranger-season-5-ep-501',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://www.cmt.com/video-clips/t9e4ci/nashville-juliette-in-2-minutes',
+ 'only_matching': True,
}]
- @classmethod
- def _transform_rtmp_url(cls, rtmp_video_url):
- if 'error_not_available.swf' in rtmp_video_url:
- raise ExtractorError(
- '%s said: video is not available' % cls.IE_NAME, expected=True)
-
- return super(CMTIE, cls)._transform_rtmp_url(rtmp_video_url)
-
def _extract_mgid(self, webpage):
- return self._search_regex(
+ mgid = self._search_regex(
r'MTVN\.VIDEO\.contentUri\s*=\s*([\'"])(?P<mgid>.+?)\1',
- webpage, 'mgid', group='mgid')
+ webpage, 'mgid', group='mgid', default=None)
+ if not mgid:
+ mgid = self._extract_triforce_mgid(webpage)
+ return mgid
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+ mgid = self._extract_mgid(webpage)
+ return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)
'ext': 'mp4',
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
'description': '',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 7713.088,
'timestamp': 1413309600,
'upload_date': '20141014',
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
-
- feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
- feed = self._parse_json(feed_json, playlist_id)
- zones = feed['manifest']['zones']
-
- video_zone = zones['t2_lc_promo1']
- feed = self._download_json(video_zone['feed'], playlist_id)
- mgid = feed['result']['data']['id']
-
+ mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
videos_info = self._get_videos_info(mgid)
return videos_info
'ext': 'mp4',
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
'description': 'Tosh asked fans to share their summer plans.',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
    # It's really reported to be published in the year 2077
'upload_date': '20770610',
'timestamp': 3390510600,
'only_matching': True,
}]
- @classmethod
- def _transform_rtmp_url(cls, rtmp_video_url):
- new_urls = super(ToshIE, cls)._transform_rtmp_url(rtmp_video_url)
- new_urls['rtmp'] = rtmp_video_url.replace('viacomccstrm', 'viacommtvstrm')
- return new_urls
-
class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'
parse_m3u8_attributes,
extract_attributes,
parse_codecs,
+ urljoin,
)
download, lower-case.
"http", "https", "rtsp", "rtmp", "rtmpe",
"m3u8", "m3u8_native" or "http_dash_segments".
- * fragments A list of fragments of the fragmented media,
- with the following entries:
- * "url" (mandatory) - fragment's URL
+ * fragment_base_url
+ Base URL for fragments. Each fragment's path
+ value (if present) will be relative to
+ this URL.
+ * fragments A list of fragments of fragmented media.
+ Each fragment entry must contain either a url
+ or a path. If a url is present it should be
+ used by the client. Otherwise both path and
+ fragment_base_url must be present. Here is
+ the list of all potential fields:
+ * "url" - fragment's URL
+ * "path" - fragment's path relative to
+ fragment_base_url
* "duration" (optional, int or float)
* "filesize" (optional, int)
* preference Order number of this format. If this field is
uploader_url: Full URL to a personal webpage of the video uploader.
location: Physical location where the video was filmed.
subtitles: The available subtitles as a dictionary in the format
- {language: subformats}. "subformats" is a list sorted from
- lower to higher preference, each element is a dictionary
- with the "ext" entry and one of:
+ {tag: subformats}. "tag" is usually a language code, and
+ "subformats" is a list sorted from lower to higher
+ preference; each element is a dictionary with the "ext"
+ entry and one of:
* "data": The subtitles file contents
* "url": A URL pointing to the subtitles file
"ext" will be calculated from URL if missing
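The resolution rule the fragments documentation describes can be sketched as follows (made-up URLs; the stdlib `urljoin` stands in for youtube-dl's own helper):

```python
# Resolving fragment fields as documented: use 'url' when present,
# otherwise join 'path' onto 'fragment_base_url'.
from urllib.parse import urljoin

fmt = {
    'fragment_base_url': 'https://example.com/seg/',
    'fragments': [
        {'path': '0001.ts', 'duration': 4.0},
        {'url': 'https://cdn.example.com/0002.ts', 'duration': 4.0},
    ],
}
urls = [f['url'] if f.get('url') else urljoin(fmt['fragment_base_url'], f['path'])
        for f in fmt['fragments']]
```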
unique_formats.append(f)
formats[:] = unique_formats
- def _is_valid_url(self, url, video_id, item='video'):
+ def _is_valid_url(self, url, video_id, item='video', headers={}):
url = self._proto_relative_url(url, scheme='http:')
# For now assume non HTTP(S) URLs always valid
if not (url.startswith('http://') or url.startswith('https://')):
return True
try:
- self._request_webpage(url, video_id, 'Checking %s URL' % item)
+ self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers)
return True
except ExtractorError as e:
if isinstance(e.cause, compat_urllib_error.URLError):
'protocol': entry_protocol,
'preference': preference,
}]
+ audio_in_video_stream = {}
last_info = {}
last_media = {}
for line in m3u8_doc.splitlines():
media = parse_m3u8_attributes(line)
media_type = media.get('TYPE')
if media_type in ('VIDEO', 'AUDIO'):
+ group_id = media.get('GROUP-ID')
media_url = media.get('URI')
if media_url:
format_id = []
- for v in (media.get('GROUP-ID'), media.get('NAME')):
+ for v in (group_id, media.get('NAME')):
if v:
format_id.append(v)
- formats.append({
+ f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'language': media.get('LANGUAGE'),
- 'vcodec': 'none' if media_type == 'AUDIO' else None,
'ext': ext,
'protocol': entry_protocol,
'preference': preference,
- })
+ }
+ if media_type == 'AUDIO':
+ f['vcodec'] = 'none'
+ if group_id and not audio_in_video_stream.get(group_id):
+ audio_in_video_stream[group_id] = False
+ formats.append(f)
else:
# When there is no URI in EXT-X-MEDIA let this tag's
# data be used by regular URI lines below
last_media = media
+ if media_type == 'AUDIO' and group_id:
+ audio_in_video_stream[group_id] = True
elif line.startswith('#') or not line.strip():
continue
else:
'abr': abr,
})
f.update(parse_codecs(last_info.get('CODECS')))
+ if audio_in_video_stream.get(last_info.get('AUDIO')) is False and f['vcodec'] != 'none':
+ # TODO: update acodec for audio only formats with the same GROUP-ID
+ f['acodec'] = 'none'
formats.append(f)
last_info = {}
last_media = {}
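The `audio_in_video_stream` bookkeeping above can be condensed into a standalone sketch (hypothetical group IDs): an `EXT-X-MEDIA` AUDIO entry *without* a URI means its audio is muxed into the video renditions of that group, while an entry *with* a URI is a standalone audio rendition, so matching video variants get `acodec` `'none'`.

```python
# Track, per GROUP-ID, whether audio ships inside the video renditions.
audio_in_video_stream = {}

def on_audio_media(group_id, has_uri):
    if not has_uri:
        audio_in_video_stream[group_id] = True   # audio muxed into video
    elif not audio_in_video_stream.get(group_id):
        audio_in_video_stream[group_id] = False  # separate audio rendition

on_audio_media('aud1', has_uri=True)   # standalone audio rendition
on_audio_media('aud2', has_uri=False)  # audio carried in the video stream

# A video variant referencing AUDIO="aud1" carries no audio track itself:
acodec_aud1 = 'none' if audio_in_video_stream.get('aud1') is False else None
acodec_aud2 = 'none' if audio_in_video_stream.get('aud2') is False else None
```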
segment_template = element.find(_add_ns('SegmentTemplate'))
if segment_template is not None:
extract_common(segment_template)
- media_template = segment_template.get('media')
- if media_template:
- ms_info['media_template'] = media_template
+ media = segment_template.get('media')
+ if media:
+ ms_info['media'] = media
initialization = segment_template.get('initialization')
if initialization:
- ms_info['initialization_url'] = initialization
+ ms_info['initialization'] = initialization
else:
extract_Initialization(segment_template)
return ms_info
- def combine_url(base_url, target_url):
- if re.match(r'^https?://', target_url):
- return target_url
- return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
-
mpd_duration = parse_duration(mpd_doc.get('mediaPresentationDuration'))
formats = []
for period in mpd_doc.findall(_add_ns('Period')):
lang = representation_attrib.get('lang')
url_el = representation.find(_add_ns('BaseURL'))
filesize = int_or_none(url_el.attrib.get('{http://youtube.com/yt/2012/10/10}contentLength') if url_el is not None else None)
+ bandwidth = int_or_none(representation_attrib.get('bandwidth'))
f = {
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
'url': base_url,
'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')),
'height': int_or_none(representation_attrib.get('height')),
- 'tbr': int_or_none(representation_attrib.get('bandwidth'), 1000),
+ 'tbr': int_or_none(bandwidth, 1000),
'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
'fps': int_or_none(representation_attrib.get('frameRate')),
- 'vcodec': 'none' if content_type == 'audio' else representation_attrib.get('codecs'),
- 'acodec': 'none' if content_type == 'video' else representation_attrib.get('codecs'),
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
'format_note': 'DASH %s' % content_type,
'filesize': filesize,
}
+ f.update(parse_codecs(representation_attrib.get('codecs')))
representation_ms_info = extract_multisegment_info(representation, adaption_set_ms_info)
- if 'segment_urls' not in representation_ms_info and 'media_template' in representation_ms_info:
- media_template = representation_ms_info['media_template']
- media_template = media_template.replace('$RepresentationID$', representation_id)
- media_template = re.sub(r'\$(Number|Bandwidth|Time)\$', r'%(\1)d', media_template)
- media_template = re.sub(r'\$(Number|Bandwidth|Time)%([^$]+)\$', r'%(\1)\2', media_template)
- media_template.replace('$$', '$')
+ def prepare_template(template_name, identifiers):
+ t = representation_ms_info[template_name]
+ t = t.replace('$RepresentationID$', representation_id)
+ t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
+ t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
+ t = t.replace('$$', '$')
+ return t
+
+ # @initialization is a regular template like @media one
+ # so it should be handled just the same way (see
+ # https://github.com/rg3/youtube-dl/issues/11605)
+ if 'initialization' in representation_ms_info:
+ initialization_template = prepare_template(
+ 'initialization',
+ # As per [1, 5.3.9.4.2, Table 15, page 54] $Number$ and
+ # $Time$ shall not be included for @initialization thus
+ # only $Bandwidth$ remains
+ ('Bandwidth', ))
+ representation_ms_info['initialization_url'] = initialization_template % {
+ 'Bandwidth': bandwidth,
+ }
+
+ if 'segment_urls' not in representation_ms_info and 'media' in representation_ms_info:
+ media_template = prepare_template('media', ('Number', 'Bandwidth', 'Time'))
# As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$
# can't be used at the same time
representation_ms_info['fragments'] = [{
'url': media_template % {
'Number': segment_number,
- 'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
+ 'Bandwidth': bandwidth,
},
'duration': segment_duration,
} for segment_number in range(
def add_segment_url():
segment_url = media_template % {
'Time': segment_time,
- 'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
+ 'Bandwidth': bandwidth,
'Number': segment_number,
}
representation_ms_info['fragments'].append({
# Example: https://www.youtube.com/watch?v=iXZV5uAYMJI
# or any YouTube dashsegments video
fragments = []
- s_num = 0
- for segment_url in representation_ms_info['segment_urls']:
- s = representation_ms_info['s'][s_num]
+ segment_index = 0
+ timescale = representation_ms_info['timescale']
+ for s in representation_ms_info['s']:
+ duration = float_or_none(s['d'], timescale)
for r in range(s.get('r', 0) + 1):
fragments.append({
- 'url': segment_url,
- 'duration': float_or_none(s['d'], representation_ms_info['timescale']),
+ 'url': representation_ms_info['segment_urls'][segment_index],
+ 'duration': duration,
})
+ segment_index += 1
representation_ms_info['fragments'] = fragments
# NB: MPD manifest may contain direct URLs to unfragmented media.
# No fragments key is present in this case.
'protocol': 'http_dash_segments',
})
if 'initialization_url' in representation_ms_info:
- initialization_url = representation_ms_info['initialization_url'].replace('$RepresentationID$', representation_id)
+ initialization_url = representation_ms_info['initialization_url']
if not f.get('url'):
f['url'] = initialization_url
f['fragments'].append({'url': initialization_url})
f['fragments'].extend(representation_ms_info['fragments'])
for fragment in f['fragments']:
- fragment['url'] = combine_url(base_url, fragment['url'])
+ fragment['url'] = urljoin(base_url, fragment['url'])
try:
existing_format = next(
fo for fo in formats
})
return formats
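The `prepare_template` helper above rewrites DASH `$identifier$` and `$identifier%fmt$` placeholders into Python `%`-style format specifiers. A standalone sketch (the real helper closes over `representation_ms_info`; note that the `'$$'` unescaping only takes effect when `str.replace`'s result is kept, since strings are immutable):

```python
import re

def prepare_template(t, identifiers=('Number', 'Bandwidth', 'Time')):
    # $Number$ -> %(Number)d ; $Number%05d$ -> %(Number)05d ; $$ -> $
    t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
    t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
    return t.replace('$$', '$')

template = prepare_template('seg_$Number%05d$_$Bandwidth$.m4s')
segment_url = template % {'Number': 7, 'Bandwidth': 800000}
```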
- def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8'):
+ def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None):
def absolute_url(video_url):
return compat_urlparse.urljoin(base_url, video_url)
def _media_formats(src, cur_media_type):
full_url = absolute_url(src)
- if determine_ext(full_url) == 'm3u8':
+ ext = determine_ext(full_url)
+ if ext == 'm3u8':
is_plain_url = False
formats = self._extract_m3u8_formats(
full_url, video_id, ext='mp4',
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
+ elif ext == 'mpd':
+ is_plain_url = False
+ formats = self._extract_mpd_formats(
+ full_url, video_id, mpd_id=mpd_id)
else:
is_plain_url = True
formats = [{
media_tags = [(media_tag, media_type, '')
for media_tag, media_type
in re.findall(r'(?s)(<(video|audio)[^>]*/>)', webpage)]
- media_tags.extend(re.findall(r'(?s)(<(?P<tag>video|audio)[^>]*>)(.*?)</(?P=tag)>', webpage))
+ media_tags.extend(re.findall(
+ # We only allow video|audio followed by a whitespace or '>'.
+ # Allowing more characters may end up in significant slow down (see
+ # https://github.com/rg3/youtube-dl/issues/11979, example URL:
+ # http://www.porntrex.com/maps/videositemap.xml).
+ r'(?s)(<(?P<tag>video|audio)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
for media_tag, media_type, media_content in media_tags:
media_info = {
'formats': [],
entries.append(media_info)
return entries
- def _extract_akamai_formats(self, manifest_url, video_id):
+ def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
formats = []
hdcore_sign = 'hdcore=3.7.0'
- f4m_url = re.sub(r'(https?://.+?)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
+ f4m_url = re.sub(r'(https?://[^/]+)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
+ hds_host = hosts.get('hds')
+ if hds_host:
+ f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
if 'hdcore=' not in f4m_url:
f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign
f4m_formats = self._extract_f4m_formats(
for entry in f4m_formats:
entry.update({'extra_param_to_segment_url': hdcore_sign})
formats.extend(f4m_formats)
- m3u8_url = re.sub(r'(https?://.+?)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
+ m3u8_url = re.sub(r'(https?://[^/]+)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
+ hls_host = hosts.get('hls')
+ if hls_host:
+ m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
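The `hosts` override above swaps only the authority part of the manifest URL, keeping scheme and path intact. A sketch with made-up host names:

```python
import re

# Replace the host portion of a manifest URL, as in the hds/hls
# host-override rewriting in _extract_akamai_formats.
def swap_host(url, host):
    return re.sub(r'(https?://)[^/]+', r'\1' + host, url)

hls_url = swap_host('https://orig-a.akamaihd.net/i/video/master.m3u8',
                    'hls.example.com')
```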
'id': '5u5n1',
'ext': 'mp4',
'title': 'The Matrix Moonwalk',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4.6,
'timestamp': 1428527772,
'upload_date': '20150408',
class CrackleIE(InfoExtractor):
- _VALID_URL = r'(?:crackle:|https?://(?:www\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
+ _VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
_TEST = {
'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',
'info_dict': {
'ext': 'mp4',
'title': 'Everybody Respects A Bloody Nose',
'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). They’re headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 906,
'series': 'Comedians In Cars Getting Coffee',
'season_number': 8,
}
}
+ _THUMBNAIL_RES = [
+ (120, 90),
+ (208, 156),
+ (220, 124),
+ (220, 220),
+ (240, 180),
+ (250, 141),
+ (315, 236),
+ (320, 180),
+ (360, 203),
+ (400, 300),
+ (421, 316),
+ (460, 330),
+ (460, 460),
+ (462, 260),
+ (480, 270),
+ (587, 330),
+ (640, 480),
+ (700, 330),
+ (700, 394),
+ (854, 480),
+ (1024, 1024),
+ (1920, 1080),
+ ]
+
# extracted from http://legacyweb-us.crackle.com/flash/ReferrerRedirect.ashx
- _THUMBNAIL_TEMPLATE = 'http://images-us-am.crackle.com/%stnl_1920x1080.jpg?ts=20140107233116?c=635333335057637614'
_MEDIA_FILE_SLOTS = {
'c544.flv': {
'width': 544,
item = self._download_xml(
'http://legacyweb-us.crackle.com/app/revamp/vidwallcache.aspx?flags=-1&fm=%s' % video_id,
- video_id).find('i')
+ video_id, headers=self.geo_verification_headers()).find('i')
title = item.attrib['t']
subtitles = {}
formats = self._extract_m3u8_formats(
'http://content.uplynk.com/ext/%s/%s.m3u8' % (config_doc.attrib['strUplynkOwnerId'], video_id),
            video_id, 'mp4', m3u8_id='hls', fatal=False)
- thumbnail = None
+ thumbnails = []
path = item.attrib.get('p')
if path:
- thumbnail = self._THUMBNAIL_TEMPLATE % path
+ for width, height in self._THUMBNAIL_RES:
+ res = '%dx%d' % (width, height)
+ thumbnails.append({
+ 'id': res,
+ 'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
+ 'width': width,
+ 'height': height,
+ 'resolution': res,
+ })
http_base_url = 'http://ahttp.crackle.com/' + path
for mfs_path, mfs_info in self._MEDIA_FILE_SLOTS.items():
formats.append({
if locale and v:
if locale not in subtitles:
subtitles[locale] = []
- subtitles[locale] = [{
- 'url': '%s/%s%s_%s.xml' % (config_doc.attrib['strSubtitleServer'], path, locale, v),
- 'ext': 'ttml',
- }]
+ for url_ext, ext in (('vtt', 'vtt'), ('xml', 'tt')):
+ subtitles.setdefault(locale, []).append({
+ 'url': '%s/%s%s_%s.%s' % (config_doc.attrib['strSubtitleServer'], path, locale, v, url_ext),
+ 'ext': ext,
+ })
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
return {
'series': item.attrib.get('sn'),
'season_number': int_or_none(item.attrib.get('se')),
'episode_number': int_or_none(item.attrib.get('ep')),
- 'thumbnail': thumbnail,
+ 'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}
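The `_THUMBNAIL_RES` table above expands into one thumbnail entry per resolution. A sketch mirroring that loop (the `path` value is hypothetical, and the table is truncated for the example):

```python
# Build a thumbnails list from (width, height) pairs, as in the
# Crackle extractor's loop.
path = '575/935/'
thumbnail_res = [(120, 90), (1920, 1080)]  # truncated for the example

thumbnails = []
for width, height in thumbnail_res:
    res = '%dx%d' % (width, height)
    thumbnails.append({
        'id': res,
        'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
        'width': width,
        'height': height,
    })
```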
'ext': 'mp4',
'title': 'Le Samouraï',
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',
'ext': 'flv',
'title': 'Culture Japan Episode 1 – Rebuilding Japan after the 3.11',
'description': 'md5:2fbc01f90b87e8e9137296f37b461c12',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Danny Choo Network',
'upload_date': '20120213',
},
'ext': 'mp4',
'title': 'Re:ZERO -Starting Life in Another World- Episode 5 – The Morning of Our Promise Is Still Distant',
'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'TV TOKYO',
'upload_date': '20160508',
},
# m3u8 download
'skip_download': True,
},
+ }, {
+ 'url': 'http://www.crunchyroll.com/konosuba-gods-blessing-on-this-wonderful-world/episode-1-give-me-deliverance-from-this-judicial-injustice-727589',
+ 'info_dict': {
+ 'id': '727589',
+ 'ext': 'mp4',
+ 'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 – Give Me Deliverance from this Judicial Injustice!",
+ 'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'uploader': 'Kadokawa Pictures Inc.',
+ 'upload_date': '20170118',
+ 'series': "KONOSUBA -God's blessing on this wonderful world!",
+ 'season_number': 2,
+ 'episode': 'Give Me Deliverance from this Judicial Injustice!',
+ 'episode_number': 1,
+ },
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ },
}, {
'url': 'http://www.crunchyroll.fr/girl-friend-beta/episode-11-goodbye-la-mode-661697',
'only_matching': True,
output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
- output += """ScaledBorderAndShadow: no
-
+ output += """
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
"""
subtitles = self.extract_subtitles(video_id, webpage)
+ # The webpage provides more accurate data than series_title from the XML
+ series = self._html_search_regex(
+ r'id=["\']showmedia_about_episode_num[^>]+>\s*<a[^>]+>([^<]+)',
+ webpage, 'series', default=xpath_text(metadata, 'series_title'))
+
+ episode = xpath_text(metadata, 'episode_title')
+ episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
+
+ season_number = int_or_none(self._search_regex(
+ r'(?s)<h4[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h4>\s*<h4>\s*Season (\d+)',
+ webpage, 'season number', default=None))
+
return {
'id': video_id,
'title': video_title,
'thumbnail': xpath_text(metadata, 'episode_image_url'),
'uploader': video_uploader,
'upload_date': video_upload_date,
- 'series': xpath_text(metadata, 'series_title'),
- 'episode': xpath_text(metadata, 'episode_title'),
- 'episode_number': int_or_none(xpath_text(metadata, 'episode_number')),
+ 'series': series,
+ 'season_number': season_number,
+ 'episode': episode,
+ 'episode_number': episode_number,
'subtitles': subtitles,
'formats': formats,
}
ExtractorError,
)
from .senateisvp import SenateISVPIE
+from .ustream import UstreamIE
class CSpanIE(InfoExtractor):
'md5': '94b29a4f131ff03d23471dd6f60b6a1d',
'info_dict': {
'id': '315139',
- 'ext': 'mp4',
'title': 'Attorney General Eric Holder on Voting Rights Act Decision',
- 'description': 'Attorney General Eric Holder speaks to reporters following the Supreme Court decision in [Shelby County v. Holder], in which the court ruled that the preclearance provisions of the Voting Rights Act could not be enforced.',
},
+ 'playlist_mincount': 2,
'skip': 'Regularly fails on travis, for unknown reasons',
}, {
'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
- 'md5': '8e5fbfabe6ad0f89f3012a7943c1287b',
+ # md5 is unstable
'info_dict': {
'id': 'c4486943',
'ext': 'mp4',
}
}, {
'url': 'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall',
- 'md5': '2ae5051559169baadba13fc35345ae74',
'info_dict': {
'id': '342759',
- 'ext': 'mp4',
'title': 'General Motors Ignition Switch Recall',
- 'duration': 14848,
- 'description': 'md5:118081aedd24bf1d3b68b3803344e7f3'
},
+ 'playlist_mincount': 6,
}, {
# Video from senate.gov
'url': 'http://www.c-span.org/video/?104517-1/immigration-reforms-needed-protect-skilled-american-workers',
'params': {
'skip_download': True, # m3u8 downloads
}
+ }, {
+ # Ustream embedded video
+ 'url': 'https://www.c-span.org/video/?114917-1/armed-services',
+ 'info_dict': {
+ 'id': '58428542',
+ 'ext': 'flv',
+ 'title': 'USHR07 Armed Services Committee',
+ 'description': 'hsas00-2118-20150204-1000et-07\n\n\nUSHR07 Armed Services Committee',
+ 'timestamp': 1423060374,
+ 'upload_date': '20150204',
+ 'uploader': 'HouseCommittee',
+ 'uploader_id': '12987475',
+ },
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video_type = None
webpage = self._download_webpage(url, video_id)
+
+ ustream_url = UstreamIE._extract_url(webpage)
+ if ustream_url:
+ return self.url_result(ustream_url, UstreamIE.ie_key())
+
# We first look for clipid, because clipprog always appears before
patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
results = list(filter(None, (re.search(p, webpage) for p in patterns)))
'ext': 'mp4',
'title': '韓國31歲童顏男 貌如十多歲小孩',
'description': '越有年紀的人,越希望看起來年輕一點,而南韓卻有一位31歲的男子,看起來像是11、12歲的小孩,身...',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1378205880,
'upload_date': '20130903',
}
'ext': 'mp4',
'title': 'iPhone6熱銷 蘋果財報亮眼',
'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150128',
'uploader_id': 'TBSCTS',
'uploader': '中華電視公司',
class CTVNewsIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)'
+ _VALID_URL = r'https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)'
_TESTS = [{
'url': 'http://www.ctvnews.ca/video?clipId=901995',
'md5': '10deb320dc0ccb8d01d34d12fc2ea672',
}, {
'url': 'http://www.ctvnews.ca/canadiens-send-p-k-subban-to-nashville-in-blockbuster-trade-1.2967231',
'only_matching': True,
+ }, {
+ 'url': 'http://vancouverisland.ctvnews.ca/video?clipId=761241',
+ 'only_matching': True,
}]
def _real_extract(self, url):
'ext': 'mp4',
'title': 'The Next, Best West',
'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'creator': 'Coldstream Creative',
'duration': 2203,
'view_count': int,
'ext': 'mp4',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
'description': 'Several come bundled with the Steam Controller.',
- 'thumbnail': 're:^https?:.*\.(?:jpg|png)$',
+ 'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 74,
'timestamp': 1425657362,
'upload_date': '20150306',
'title': '마크 헌트 vs 안토니오 실바',
'description': 'Mark Hunt vs Antonio Silva',
'upload_date': '20131217',
- 'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 2117,
'view_count': int,
'comment_count': int,
'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118',
'description': 'md5:79794514261164ff27e36a21ad229fc5',
'upload_date': '20150604',
- 'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 154,
'view_count': int,
'comment_count': int,
'title': '01-Korean War ( Trouble on the horizon )',
'description': '\nKorean War 01\nTrouble on the horizon\n전쟁의 먹구름',
'upload_date': '20080223',
- 'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 249,
'view_count': int,
'comment_count': int,
'title': 'DOTA 2GETHER 시즌2 6회 - 2부',
'description': 'DOTA 2GETHER 시즌2 6회 - 2부',
'upload_date': '20130831',
- 'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 3868,
'view_count': int,
},
'ext': 'mp4',
'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen',
'description': 'md5:1504a54606c4dde3e4e61fc97aa857e0',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
'timestamp': 1404039863,
'upload_date': '20140629',
'duration': 69.544,
'title': 'Videoinstallation für eine Kaufhausfassade',
'description': 'Kurzfilm',
'upload_date': '20110407',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}
'id': '176747451',
'title': 'Best!',
'uploader': 'Anonymous',
- 'thumbnail': 're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
+ 'thumbnail': r're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
},
'playlist_count': 30,
'skip': 'Only available in .de',
'title': 'MARSHALL PLAN AT WORK IN WESTERN GERMANY, THE',
'description': 'md5:1fabd480c153f97b07add61c44407c82',
'duration': 660,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.dhm.de/filmarchiv/02-mapping-the-wall/peter-g/rolle-1/',
'id': 'rolle-1',
'ext': 'flv',
'title': 'ROLLE 1',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}]
'id': 's8uk0r',
'ext': 'mp4',
'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 74,
'upload_date': '20150317',
'timestamp': 1426604939,
'id': 'xvpfp8',
'ext': 'mp4',
'title': 'Two - C\'est La Vie (clip)',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 233,
'upload_date': '20150224',
'timestamp': 1424760500,
extract_attributes,
int_or_none,
parse_age_limit,
- unescapeHTML,
ExtractorError,
)
webpage, 'video container'))
video = self._parse_json(
- unescapeHTML(container.get('data-video') or container.get('data-json')),
+ container.get('data-video') or container.get('data-json'),
display_id)
title = video['name']
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ unified_strdate,
+ compat_str,
+ determine_ext,
+)
+
+
+class DisneyIE(InfoExtractor):
+ _VALID_URL = r'''(?x)
+ https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr)|starwars\.com))/(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})'''
+ _TESTS = [{
+ 'url': 'http://video.disney.com/watch/moana-trailer-545ed1857afee5a0ec239977',
+ 'info_dict': {
+ 'id': '545ed1857afee5a0ec239977',
+ 'ext': 'mp4',
+ 'title': 'Moana - Trailer',
+ 'description': 'A fun adventure for the entire Family! Bring home Moana on Digital HD Feb 21 & Blu-ray March 7',
+ 'upload_date': '20170112',
+ },
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ }
+ }, {
+ 'url': 'http://videos.disneylatino.com/ver/spider-man-de-regreso-a-casa-primer-adelanto-543a33a1850bdcfcca13bae2',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://video.en.disneyme.com/watch/future-worm/robo-carp-2001-544b66002aa7353cdd3f5114',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://video.disneyturkiye.com.tr/izle/7c-7-cuceler/kimin-sesi-zaten-5456f3d015f6b36c8afdd0e2',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://disneyjunior.disney.com/embed/546a4798ddba3d1612e4005d',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://www.starwars.com/embed/54690d1e6c42e5f09a0fb097',
+ 'only_matching': True,
+ }]
+
+ def _real_extract(self, url):
+ domain, video_id = re.match(self._VALID_URL, url).groups()
+ webpage = self._download_webpage(
+ 'http://%s/embed/%s' % (domain, video_id), video_id)
+ video_data = self._parse_json(self._search_regex(
+ r'Disney\.EmbedVideo=({.+});', webpage, 'embed data'), video_id)['video']
+
+ for external in video_data.get('externals', []):
+ if external.get('source') == 'vevo':
+ return self.url_result('vevo:' + external['data_id'], 'Vevo')
+
+ title = video_data['title']
+
+ formats = []
+ for flavor in video_data.get('flavors', []):
+ flavor_format = flavor.get('format')
+ flavor_url = flavor.get('url')
+ if not flavor_url or not re.match(r'https?://', flavor_url):
+ continue
+ tbr = int_or_none(flavor.get('bitrate'))
+ if tbr == 99999:
+ formats.extend(self._extract_m3u8_formats(
+ flavor_url, video_id, 'mp4', m3u8_id=flavor_format, fatal=False))
+ continue
+ format_id = []
+ if flavor_format:
+ format_id.append(flavor_format)
+ if tbr:
+ format_id.append(compat_str(tbr))
+ ext = determine_ext(flavor_url)
+ if flavor_format == 'applehttp' or ext == 'm3u8':
+ ext = 'mp4'
+ width = int_or_none(flavor.get('width'))
+ height = int_or_none(flavor.get('height'))
+ formats.append({
+ 'format_id': '-'.join(format_id),
+ 'url': flavor_url,
+ 'width': width,
+ 'height': height,
+ 'tbr': tbr,
+ 'ext': ext,
+ 'vcodec': 'none' if (width == 0 and height == 0) else None,
+ })
+ self._sort_formats(formats)
+
+ subtitles = {}
+ for caption in video_data.get('captions', []):
+ caption_url = caption.get('url')
+ caption_format = caption.get('format')
+ if not caption_url or caption_format.startswith('unknown'):
+ continue
+ subtitles.setdefault(caption.get('language', 'en'), []).append({
+ 'url': caption_url,
+ 'ext': {
+ 'webvtt': 'vtt',
+ }.get(caption_format, caption_format),
+ })
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': video_data.get('description') or video_data.get('short_desc'),
+ 'thumbnail': video_data.get('thumb') or video_data.get('thumb_secure'),
+ 'duration': int_or_none(video_data.get('duration_sec')),
+ 'upload_date': unified_strdate(video_data.get('publish_date')),
+ 'formats': formats,
+ 'subtitles': subtitles,
+ }
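As a side note, the flavor-to-format mapping in `DisneyIE._real_extract` above can be exercised standalone. The sketch below mirrors the non-HLS branch of the loop with a hypothetical sample flavor dict (the field names follow the code above; the URL and values are made up):

```python
def flavor_to_format(flavor):
    """Mirror the non-HLS branch above: build a format dict from one flavor."""
    format_id = []
    if flavor.get('format'):
        format_id.append(flavor['format'])
    tbr = flavor.get('bitrate')
    if tbr:
        format_id.append(str(tbr))
    width = flavor.get('width')
    height = flavor.get('height')
    return {
        'format_id': '-'.join(format_id),
        'url': flavor['url'],
        'width': width,
        'height': height,
        'tbr': tbr,
        # zero-sized flavors carry no video track, so mark the video codec absent
        'vcodec': 'none' if (width == 0 and height == 0) else None,
    }

sample = {'format': 'mp4', 'bitrate': 1500, 'url': 'http://example.com/v.mp4',
          'width': 1280, 'height': 720}
print(flavor_to_format(sample)['format_id'])  # mp4-1500
```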
class DouyuTVIE(InfoExtractor):
IE_DESC = '斗鱼'
- _VALID_URL = r'https?://(?:www\.)?douyu(?:tv)?\.com/(?P<id>[A-Za-z0-9]+)'
+ _VALID_URL = r'https?://(?:www\.)?douyu(?:tv)?\.com/(?:[^/]+/)*(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'http://www.douyutv.com/iseven',
'info_dict': {
'display_id': 'iseven',
'ext': 'flv',
'title': 're:^清晨醒脑!T-ara根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
- 'description': 're:.*m7show@163\.com.*',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'description': r're:.*m7show@163\.com.*',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',
'is_live': True,
},
'ext': 'flv',
'title': 're:^小漠从零单排记!——CSOL2躲猫猫 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:746a2f7a253966a06755a912f0acc0d2',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'douyu小漠',
'is_live': True,
},
'display_id': '17732',
'ext': 'flv',
'title': 're:^清晨醒脑!T-ara根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
- 'description': 're:.*m7show@163\.com.*',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'description': r're:.*m7show@163\.com.*',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',
'is_live': True,
},
}, {
'url': 'http://www.douyu.com/xiaocang',
'only_matching': True,
+ }, {
+ # \"room_id\"
+ 'url': 'http://www.douyu.com/t/lpl',
+ 'only_matching': True,
}]
# Decompile core.swf in webpage by ffdec "Search SWFs in memory". core.swf
# is encrypted originally, but ffdec can dump memory to get the decrypted one.
else:
page = self._download_webpage(url, video_id)
room_id = self._html_search_regex(
- r'"room_id"\s*:\s*(\d+),', page, 'room id')
+ r'"room_id\\?"\s*:\s*(\d+),', page, 'room id')
room = self._download_json(
'http://m.douyu.com/html5/live?roomId=%s' % room_id, video_id,
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
+ USER_AGENTS,
int_or_none,
update_url_query,
)
manifest_url, video_id, ext='mp4',
entry_protocol='m3u8_native', m3u8_id=protocol, fatal=False)
# Sometimes final URLs inside m3u8 are unsigned, let's fix this
- # ourselves
+ # ourselves. Also fragments' URLs are only served signed for
+ # Safari user agent.
query = compat_urlparse.parse_qs(compat_urlparse.urlparse(manifest_url).query)
for m3u8_format in m3u8_formats:
- m3u8_format['url'] = update_url_query(m3u8_format['url'], query)
+ m3u8_format.update({
+ 'url': update_url_query(m3u8_format['url'], query),
+ 'http_headers': {
+ 'User-Agent': USER_AGENTS['Safari'],
+ },
+ })
formats.extend(m3u8_formats)
elif protocol == 'hds':
formats.extend(self._extract_f4m_formats(
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
- _VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
+ _VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': {
'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
'episode': 'Episode 1',
'episode_number': 1,
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1404336058,
'upload_date': '20140702',
'duration': 343,
'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91',
'episode': 'Mnet Asian Music Awards 2015 - Part 3',
'episode_number': 4,
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1450213200,
'upload_date': '20151215',
'duration': 5602,
# m3u8 download
'skip_download': True,
},
+ }, {
+ 'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/',
+ 'only_matching': True,
}]
def _real_extract(self, url):
class DramaFeverSeriesIE(DramaFeverBaseIE):
IE_NAME = 'dramafever:series'
- _VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
+ _VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {
'ext': 'mp4',
'title': 'Talkshowet - Leonard Cohen',
'description': 'md5:8f34194fb30cd8c8a30ad8b27b70c0ca',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1295537932,
'upload_date': '20110120',
'duration': 3664,
'ext': 'mp3',
'title': 'EM fodbold 1992 Danmark - Tyskland finale Transmission',
'description': 'md5:501e5a195749480552e214fbbed16c4e',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
'timestamp': 1223274900,
'upload_date': '20081006',
'duration': 7369,
import re
-from .zdf import ZDFIE
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ unified_strdate,
+ xpath_text,
+ determine_ext,
+ qualities,
+ float_or_none,
+ ExtractorError,
+)
-class DreiSatIE(ZDFIE):
+class DreiSatIE(InfoExtractor):
IE_NAME = '3sat'
_VALID_URL = r'(?:https?://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
_TESTS = [
},
]
+ def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
+ param_groups = {}
+ for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
+ group_id = param_group.attrib.get(self._xpath_ns('id', 'http://www.w3.org/XML/1998/namespace'))
+ params = {}
+ for param in param_group:
+ params[param.get('name')] = param.get('value')
+ param_groups[group_id] = params
+
+ formats = []
+ for video in smil.findall(self._xpath_ns('.//video', namespace)):
+ src = video.get('src')
+ if not src:
+ continue
+ bitrate = float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
+ group_id = video.get('paramGroup')
+ param_group = param_groups[group_id]
+ for proto in param_group['protocols'].split(','):
+ formats.append({
+ 'url': '%s://%s' % (proto, param_group['host']),
+ 'app': param_group['app'],
+ 'play_path': src,
+ 'ext': 'flv',
+ 'format_id': '%s-%d' % (proto, bitrate),
+ 'tbr': bitrate,
+ })
+ self._sort_formats(formats)
+ return formats
+
+ def extract_from_xml_url(self, video_id, xml_url):
+ doc = self._download_xml(
+ xml_url, video_id,
+ note='Downloading video info',
+ errnote='Failed to download video info')
+
+ status_code = doc.find('./status/statuscode')
+ if status_code is not None and status_code.text != 'ok':
+ code = status_code.text
+ if code == 'notVisibleAnymore':
+ message = 'Video %s is not available' % video_id
+ else:
+ message = '%s returned error: %s' % (self.IE_NAME, code)
+ raise ExtractorError(message, expected=True)
+
+ title = doc.find('.//information/title').text
+ description = xpath_text(doc, './/information/detail', 'description')
+ duration = int_or_none(xpath_text(doc, './/details/lengthSec', 'duration'))
+ uploader = xpath_text(doc, './/details/originChannelTitle', 'uploader')
+ uploader_id = xpath_text(doc, './/details/originChannelId', 'uploader id')
+ upload_date = unified_strdate(xpath_text(doc, './/details/airtime', 'upload date'))
+
+ def xml_to_thumbnails(fnode):
+ thumbnails = []
+ for node in fnode:
+ thumbnail_url = node.text
+ if not thumbnail_url:
+ continue
+ thumbnail = {
+ 'url': thumbnail_url,
+ }
+ if 'key' in node.attrib:
+ m = re.match('^([0-9]+)x([0-9]+)$', node.attrib['key'])
+ if m:
+ thumbnail['width'] = int(m.group(1))
+ thumbnail['height'] = int(m.group(2))
+ thumbnails.append(thumbnail)
+ return thumbnails
+
+ thumbnails = xml_to_thumbnails(doc.findall('.//teaserimages/teaserimage'))
+
+ format_nodes = doc.findall('.//formitaeten/formitaet')
+ quality = qualities(['veryhigh', 'high', 'med', 'low'])
+
+ def get_quality(elem):
+ return quality(xpath_text(elem, 'quality'))
+ format_nodes.sort(key=get_quality)
+ format_ids = []
+ formats = []
+ for fnode in format_nodes:
+ video_url = fnode.find('url').text
+ is_available = 'http://www.metafilegenerator' not in video_url
+ if not is_available:
+ continue
+ format_id = fnode.attrib['basetype']
+ quality = xpath_text(fnode, './quality', 'quality')
+ format_m = re.match(r'''(?x)
+ (?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
+ (?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
+ ''', format_id)
+
+ ext = determine_ext(video_url, None) or format_m.group('container')
+ if ext not in ('smil', 'f4m', 'm3u8'):
+ format_id = format_id + '-' + quality
+ if format_id in format_ids:
+ continue
+
+ if ext == 'meta':
+ continue
+ elif ext == 'smil':
+ formats.extend(self._extract_smil_formats(
+ video_url, video_id, fatal=False))
+ elif ext == 'm3u8':
+ # the certificates are misconfigured (see
+ # https://github.com/rg3/youtube-dl/issues/8665)
+ if video_url.startswith('https://'):
+ continue
+ formats.extend(self._extract_m3u8_formats(
+ video_url, video_id, 'mp4', m3u8_id=format_id, fatal=False))
+ elif ext == 'f4m':
+ formats.extend(self._extract_f4m_formats(
+ video_url, video_id, f4m_id=format_id, fatal=False))
+ else:
+ proto = format_m.group('proto').lower()
+
+ abr = int_or_none(xpath_text(fnode, './audioBitrate', 'abr'), 1000)
+ vbr = int_or_none(xpath_text(fnode, './videoBitrate', 'vbr'), 1000)
+
+ width = int_or_none(xpath_text(fnode, './width', 'width'))
+ height = int_or_none(xpath_text(fnode, './height', 'height'))
+
+ filesize = int_or_none(xpath_text(fnode, './filesize', 'filesize'))
+
+ format_note = None
+
+ formats.append({
+ 'format_id': format_id,
+ 'url': video_url,
+ 'ext': ext,
+ 'acodec': format_m.group('acodec'),
+ 'vcodec': format_m.group('vcodec'),
+ 'abr': abr,
+ 'vbr': vbr,
+ 'width': width,
+ 'height': height,
+ 'filesize': filesize,
+ 'format_note': format_note,
+ 'protocol': proto,
+ '_available': is_available,
+ })
+ format_ids.append(format_id)
+
+ self._sort_formats(formats)
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': description,
+ 'duration': duration,
+ 'thumbnails': thumbnails,
+ 'uploader': uploader,
+ 'uploader_id': uploader_id,
+ 'upload_date': upload_date,
+ 'formats': formats,
+ }
+
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
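For reference, the `qualities(['veryhigh', 'high', 'med', 'low'])` ranking used when sorting format nodes above relies on the small helper from `youtube_dl.utils`. A minimal reimplementation, shown only for illustration:

```python
def qualities(quality_ids):
    """Return a key function ranking an id by its position in quality_ids."""
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1  # unknown qualities sort first
    return q

quality = qualities(['veryhigh', 'high', 'med', 'low'])
# Sorting format nodes by this key puts 'veryhigh' first, as above.
ranked = sorted(['low', 'veryhigh', 'med'], key=quality)
print(ranked)  # ['veryhigh', 'med', 'low']
```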
'like_count': int,
'comment_count': int,
'categories': ['Babe', 'Blonde', 'Erotic', 'Outdoor', 'Softcore', 'Solo'],
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
}
}, {
mimetype2ext,
parse_iso8601,
remove_end,
+ update_url_query,
)
class DRTVIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv/se|nyheder)/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'
-
+ _VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio/ondemand)/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'
+ IE_NAME = 'drtv'
_TESTS = [{
'url': 'https://www.dr.dk/tv/se/boern/ultra/klassen-ultra/klassen-darlig-taber-10',
'md5': '25e659cccc9a2ed956110a299fdf5983',
subtitles = {}
for asset in data['Assets']:
- if asset.get('Kind') == 'Image':
+ kind = asset.get('Kind')
+ if kind == 'Image':
thumbnail = asset.get('Uri')
- elif asset.get('Kind') == 'VideoResource':
+ elif kind in ('VideoResource', 'AudioResource'):
duration = float_or_none(asset.get('DurationInMilliseconds'), 1000)
restricted_to_denmark = asset.get('RestrictedToDenmark')
spoken_subtitles = asset.get('Target') == 'SpokenSubtitles'
preference = -1
format_id += '-spoken-subtitles'
if target == 'HDS':
- formats.extend(self._extract_f4m_formats(
+ f4m_formats = self._extract_f4m_formats(
uri + '?hdcore=3.3.0&plugin=aasp-3.3.0.99.43',
- video_id, preference, f4m_id=format_id))
+ video_id, preference, f4m_id=format_id)
+ if kind == 'AudioResource':
+ for f in f4m_formats:
+ f['vcodec'] = 'none'
+ formats.extend(f4m_formats)
elif target == 'HLS':
formats.extend(self._extract_m3u8_formats(
uri, video_id, 'mp4', entry_protocol='m3u8_native',
'format_id': format_id,
'tbr': int_or_none(bitrate),
'ext': link.get('FileFormat'),
+ 'vcodec': 'none' if kind == 'AudioResource' else None,
})
subtitles_list = asset.get('SubtitlesList')
if isinstance(subtitles_list, list):
'formats': formats,
'subtitles': subtitles,
}
+
+
+class DRTVLiveIE(InfoExtractor):
+ IE_NAME = 'drtv:live'
+ _VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv|TV)/live/(?P<id>[\da-z-]+)'
+ _TEST = {
+ 'url': 'https://www.dr.dk/tv/live/dr1',
+ 'info_dict': {
+ 'id': 'dr1',
+ 'ext': 'mp4',
+ 'title': 're:^DR1 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
+ },
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ },
+ }
+
+ def _real_extract(self, url):
+ channel_id = self._match_id(url)
+ channel_data = self._download_json(
+ 'https://www.dr.dk/mu-online/api/1.0/channel/' + channel_id,
+ channel_id)
+ title = self._live_title(channel_data['Title'])
+
+ formats = []
+ for streaming_server in channel_data.get('StreamingServers', []):
+ server = streaming_server.get('Server')
+ if not server:
+ continue
+ link_type = streaming_server.get('LinkType')
+ for quality in streaming_server.get('Qualities', []):
+ for stream in quality.get('Streams', []):
+ stream_path = stream.get('Stream')
+ if not stream_path:
+ continue
+ stream_url = update_url_query(
+ '%s/%s' % (server, stream_path), {'b': ''})
+ if link_type == 'HLS':
+ formats.extend(self._extract_m3u8_formats(
+ stream_url, channel_id, 'mp4',
+ m3u8_id=link_type, fatal=False, live=True))
+ elif link_type == 'HDS':
+ formats.extend(self._extract_f4m_formats(update_url_query(
+ '%s/%s' % (server, stream_path), {'hdcore': '3.7.0'}),
+ channel_id, f4m_id=link_type, fatal=False))
+ self._sort_formats(formats)
+
+ return {
+ 'id': channel_id,
+ 'title': title,
+ 'thumbnail': channel_data.get('PrimaryImageUri'),
+ 'formats': formats,
+ 'is_live': True,
+ }
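The live-stream URLs above are assembled with `update_url_query` from `youtube_dl.utils`. A rough standalone equivalent using the Python 3 stdlib, with a made-up server/path, shows the effect:

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

def update_url_query(url, query):
    # merge `query` into the URL's existing query string
    parsed = urlparse(url)
    qs = dict(parse_qsl(parsed.query))
    qs.update(query)
    return urlunparse(parsed._replace(query=urlencode(qs)))

url = update_url_query('http://stream.example.com/ch/dr1', {'hdcore': '3.7.0'})
print(url)  # http://stream.example.com/ch/dr1?hdcore=3.7.0
```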
'ext': 'mp4',
'title': 'Ik heb nieuws voor je',
'description': 'Niet schrikken hoor',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://www.dumpert.nl/embed/6675421/dc440fe7/',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+
+
+class EggheadCourseIE(InfoExtractor):
+ IE_DESC = 'egghead.io course'
+ IE_NAME = 'egghead:course'
+ _VALID_URL = r'https://egghead\.io/courses/(?P<id>[a-zA-Z_0-9-]+)'
+ _TEST = {
+ 'url': 'https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript',
+ 'playlist_count': 29,
+ 'info_dict': {
+ 'id': 'professor-frisby-introduces-composable-functional-javascript',
+ 'title': 'Professor Frisby Introduces Composable Functional JavaScript',
+ 'description': 're:(?s)^This course teaches the ubiquitous.*You\'ll start composing functionality before you know it.$',
+ },
+ }
+
+ def _real_extract(self, url):
+ playlist_id = self._match_id(url)
+ webpage = self._download_webpage(url, playlist_id)
+
+ title = self._html_search_regex(r'<h1 class="title">([^<]+)</h1>', webpage, 'title')
+ ul = self._search_regex(r'(?s)<ul class="series-lessons-list">(.*?)</ul>', webpage, 'session list')
+
+ found = re.findall(r'(?s)<a class="[^"]*"\s*href="([^"]+)">\s*<li class="item', ul)
+ entries = [self.url_result(m) for m in found]
+
+ return {
+ '_type': 'playlist',
+ 'id': playlist_id,
+ 'title': title,
+ 'description': self._og_search_description(webpage),
+ 'entries': entries,
+ }
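The lesson-list scraping in `EggheadCourseIE` above can be checked against a toy page. The HTML below is hypothetical but follows the markup the two regexes expect:

```python
import re

SAMPLE = '''
<ul class="series-lessons-list">
  <a class="lesson-link" href="https://egghead.io/lessons/one"> <li class="item">One</li></a>
  <a class="lesson-link" href="https://egghead.io/lessons/two"> <li class="item">Two</li></a>
</ul>
'''

# grab the <ul> body, then pull each lesson URL out of it
ul = re.search(r'(?s)<ul class="series-lessons-list">(.*?)</ul>', SAMPLE).group(1)
links = re.findall(r'(?s)<a class="[^"]*"\s*href="([^"]+)">\s*<li class="item', ul)
print(links)  # two lesson URLs
```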
'id': '2447',
'ext': 'mp4',
'title': 'Ek Villain',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:9d29fc91a7abadd4591fb862fa560d93',
}
},
'id': '1671',
'ext': 'mp4',
'title': 'Soodhu Kavvuum',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:b40f2bf7320b4f9414f3780817b2af8c',
}
},
from __future__ import unicode_literals
from .common import InfoExtractor
-from ..utils import unified_strdate
+from ..utils import strip_jsonp, unified_strdate
class ElPaisIE(InfoExtractor):
'description': 'Que sí, que las cápsulas son cómodas. Pero si le pides algo más a la vida, quizá deberías aprender a usar bien la cafetera italiana. No tienes más que ver este vídeo y seguir sus siete normas básicas.',
'upload_date': '20160303',
}
+ }, {
+ 'url': 'http://elpais.com/elpais/2017/01/26/ciencia/1485456786_417876.html',
+ 'md5': '9c79923a118a067e1a45789e1e0b0f9c',
+ 'info_dict': {
+ 'id': '1485456786_417876',
+ 'ext': 'mp4',
+ 'title': 'Hallado un barco de la antigua Roma que naufragó en Baleares hace 1.800 años',
+ 'description': 'La nave portaba cientos de ánforas y se hundió cerca de la isla de Cabrera por razones desconocidas',
+ 'upload_date': '20170127',
+ },
}]
def _real_extract(self, url):
prefix = self._html_search_regex(
r'var\s+url_cache\s*=\s*"([^"]+)";', webpage, 'URL prefix')
- video_suffix = self._search_regex(
- r"(?:URLMediaFile|urlVideo_\d+)\s*=\s*url_cache\s*\+\s*'([^']+)'", webpage, 'video URL')
+ id_multimedia = self._search_regex(
+ r"id_multimedia\s*=\s*'([^']+)'", webpage, 'ID multimedia', default=None)
+ if id_multimedia:
+ url_info = self._download_json(
+ 'http://elpais.com/vdpep/1/?pepid=' + id_multimedia, video_id, transform_source=strip_jsonp)
+ video_suffix = url_info['mp4']
+ else:
+ video_suffix = self._search_regex(
+ r"(?:URLMediaFile|urlVideo_\d+)\s*=\s*url_cache\s*\+\s*'([^']+)'", webpage, 'video URL')
video_url = prefix + video_suffix
thumbnail_suffix = self._search_regex(
r"(?:URLMediaStill|urlFotogramaFijo_\d+)\s*=\s*url_cache\s*\+\s*'([^']+)'",
'display_id': 'sexy-babe-softcore',
'ext': 'm4v',
'title': 'sexy babe softcore',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
'age_limit': 18,
}
}, {
'id': '1133519',
'ext': 'm4v',
'title': 'Try It On Pee_cut_2.wmv - 4shared.com - file sharing - download movie file',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
'age_limit': 18,
},
'skip': 'Requires login',
'ext': 'mp4',
'description': "Baldur's Gate: Original, Modded or Enhanced Edition? I'll break down what you can expect from the new Baldur's Gate: Enhanced Edition.",
'title': "Breaking Down Baldur's Gate",
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 264,
'uploader': 'The Escapist',
}
'ext': 'mp4',
'description': 'This week, Zero Punctuation reviews Evolve.',
'title': 'Evolve - One vs Multiplayer',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 304,
'uploader': 'The Escapist',
}
'ext': 'mp4',
'title': 'ArcGIS Online - Developing Applications',
'description': 'Jeremy Bartley demonstrates how to develop applications with ArcGIS Online.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 185,
'upload_date': '20120419',
}
'ext': 'mp4',
'title': 'TRADE - Wikileaks on TTIP',
'description': 'NEW LIVE EC Midday press briefing of 11/08/2015',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150811',
'duration': 34,
'view_count': int,
'ext': 'mp4',
'title': 'NYX Butter Lipstick Little Susie',
'description': 'Goes on like butter, but looks better!',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Stephanie S.',
'upload_date': '20150520',
'view_count': int,
AENetworksIE,
HistoryTopicIE,
)
-from .afreecatv import AfreecaTVIE
+from .afreecatv import (
+ AfreecaTVIE,
+ AfreecaTVGlobalIE,
+)
from .airmozilla import AirMozillaIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .animeondemand import AnimeOnDemandIE
from .anitube import AnitubeIE
from .anysex import AnySexIE
-from .aol import (
- AolIE,
- AolFeaturesIE,
-)
+from .aol import AolIE
from .allocine import AllocineIE
from .aparat import AparatIE
from .appleconnect import AppleConnectIE
AWAANLiveIE,
AWAANSeasonIE,
)
+from .azmedien import (
+ AZMedienIE,
+ AZMedienPlaylistIE,
+)
from .azubu import AzubuIE, AzubuLiveIE
from .baidu import BaiduVideoIE
from .bambuser import BambuserIE, BambuserChannelIE
BBCCoUkPlaylistIE,
BBCIE,
)
+from .beampro import BeamProLiveIE
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE
from .bet import BetIE
from .bigflix import BigflixIE
from .bild import BildIE
-from .bilibili import BiliBiliIE
+from .bilibili import (
+ BiliBiliIE,
+ BiliBiliBangumiIE,
+)
from .biobiochiletv import BioBioChileTVIE
from .biqle import BIQLEIE
from .bleacherreport import (
)
from .cbssports import CBSSportsIE
from .ccc import CCCIE
+from .ccma import CCMAIE
from .cctv import CCTVIE
from .cda import CDAIE
from .ceskatelevize import CeskaTelevizeIE
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
-from .drtv import DRTVIE
+from .drtv import (
+ DRTVIE,
+ DRTVLiveIE,
+)
from .dvtv import DVTVIE
from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .discoverygo import DiscoveryGoIE
+from .disney import DisneyIE
from .dispeak import DigitallySpeakingIE
from .dropbox import DropboxIE
from .dw import (
from .eagleplatform import EaglePlatformIE
from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE
+from .egghead import EggheadCourseIE
from .ehow import EHowIE
from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE
FC2EmbedIE,
)
from .fczenit import FczenitIE
+from .filmon import (
+ FilmOnIE,
+ FilmOnChannelIE,
+)
from .firstpost import FirstpostIE
from .firsttv import FirstTVIE
from .fivemin import FiveMinIE
)
from .freesound import FreesoundIE
from .freespeech import FreespeechIE
-from .freevideo import FreeVideoIE
from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE
from .gamersyde import GamersydeIE
from .gamespot import GameSpotIE
from .gamestar import GameStarIE
+from .gaskrank import GaskrankIE
from .gazeta import GazetaIE
from .gdcvault import GDCVaultIE
from .generic import GenericIE
)
from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE
+from .hitrecord import HitRecordIE
from .hornbunny import HornBunnyIE
from .hotnewhiphop import HotNewHipHopIE
from .hotstar import HotStarIE
ImgurAlbumIE,
)
from .ina import InaIE
+from .inc import IncIE
from .indavideo import (
IndavideoIE,
IndavideoEmbedIE,
from .iprima import IPrimaIE
from .iqiyi import IqiyiIE
from .ir90tv import Ir90TvIE
+from .itv import ITVIE
from .ivi import (
IviIE,
IviCompilationIE
KuwoMvIE,
)
from .la7 import LA7IE
-from .laola1tv import Laola1TvIE
+from .laola1tv import (
+ Laola1TvEmbedIE,
+ Laola1TvIE,
+)
from .lci import LCIIE
from .lcp import (
LcpPlayIE,
)
from .matchtv import MatchTVIE
from .mdr import MDRIE
+from .meipai import MeipaiIE
+from .melonvod import MelonVODIE
from .meta import METAIE
from .metacafe import MetacafeIE
from .metacritic import MetacriticIE
MTVVideoIE,
MTVServicesEmbeddedIE,
MTVDEIE,
+ MTV81IE,
)
from .muenchentv import MuenchenTVIE
from .musicplayon import MusicPlayOnIE
NextMediaIE,
NextMediaActionNewsIE,
AppleDailyIE,
+ NextTVIE,
)
from .nfb import NFBIE
from .nfl import NFLIE
NRKPlaylistIE,
NRKSkoleIE,
NRKTVIE,
+ NRKTVDirekteIE,
+ NRKTVEpisodesIE,
+ NRKTVSeriesIE,
)
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .odatv import OdaTVIE
from .odnoklassniki import OdnoklassnikiIE
from .oktoberfesttv import OktoberfestTVIE
+from .ondemandkorea import OnDemandKoreaIE
from .onet import (
OnetIE,
OnetChannelIE,
from .philharmoniedeparis import PhilharmonieDeParisIE
from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE
+from .piksel import PikselIE
from .pinkbike import PinkbikeIE
from .pladform import PladformIE
from .playfm import PlayFMIE
)
from .porn91 import Porn91IE
from .porncom import PornComIE
+from .pornflip import PornFlipIE
from .pornhd import PornHdIE
from .pornhub import (
PornHubIE,
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
-from .screenjunkies import ScreenJunkiesIE
from .seeker import SeekerIE
from .senateisvp import SenateISVPIE
from .sendtonews import SendtoNewsIE
SharedIE,
VivoIE,
)
-from .sharesix import ShareSixIE
+from .showroomlive import ShowRoomLiveIE
from .sina import SinaIE
from .sixplay import SixPlayIE
from .skynewsarabia import (
from .spike import SpikeIE
from .stitcher import StitcherIE
from .sport5 import Sport5IE
-from .sportbox import (
- SportBoxIE,
- SportBoxEmbedIE,
-)
+from .sportbox import SportBoxEmbedIE
from .sportdeutschland import SportDeutschlandIE
from .sportschau import SportschauIE
from .srgssr import (
)
from .tv3 import TV3IE
from .tv4 import TV4IE
+from .tva import TVAIE
from .tvanouvelles import (
TVANouvellesIE,
TVANouvellesArticleIE,
TwitchChapterIE,
TwitchVodIE,
TwitchProfileIE,
+ TwitchAllVideosIE,
+ TwitchUploadsIE,
TwitchPastBroadcastsIE,
+ TwitchHighlightsIE,
TwitchStreamIE,
TwitchClipsIE,
)
UdemyCourseIE
)
from .udn import UDNEmbedIE
+from .uktvplay import UKTVPlayIE
from .digiteka import DigitekaIE
from .unistra import UnistraIE
from .uol import UOLIE
from .viceland import VicelandIE
from .vidbit import VidbitIE
from .viddler import ViddlerIE
+from .videa import VideaIE
from .videodetective import VideoDetectiveIE
from .videofyme import VideofyMeIE
from .videomega import VideoMegaIE
VideomoreSeasonIE,
)
from .videopremium import VideoPremiumIE
-from .videott import VideoTtIE
+from .videopress import VideoPressIE
from .vidio import VidioIE
from .vidme import (
VidmeIE,
VikiIE,
VikiChannelIE,
)
+from .viu import (
+ ViuIE,
+ ViuPlaylistIE,
+ ViuOTTIE,
+)
from .vk import (
VKIE,
VKUserVideosIE,
VKWallPostIE,
)
-from .vlive import VLiveIE
+from .vlive import (
+ VLiveIE,
+ VLiveChannelIE
+)
from .vodlocker import VodlockerIE
from .vodplatform import VODPlatformIE
from .voicerepublic import VoiceRepublicIE
from .vrt import VRTIE
from .vube import VubeIE
from .vuclip import VuClipIE
+from .vvvvid import VVVVIDIE
from .vyborymos import VyboryMosIE
from .vzaar import VzaarIE
from .walla import WallaIE
compat_urllib_parse_unquote_plus,
)
from ..utils import (
+ clean_html,
error_to_compat_str,
ExtractorError,
+ get_element_by_id,
int_or_none,
+ js_to_json,
limit_length,
sanitized_Request,
+ try_get,
urlencode_postdata,
- get_element_by_id,
- clean_html,
)
_VALID_URL = r'''(?x)
(?:
https?://
- (?:[\w-]+\.)?facebook\.com/
+ (?:[\w-]+\.)?(?:facebook\.com|facebookcorewwwi\.onion)/
(?:[^#]*?\#!/)?
(?:
(?:
'info_dict': {
'id': '274175099429670',
'ext': 'mp4',
- 'title': 'Facebook video #274175099429670',
+ 'title': 'Asif Nawab Butt posted a video to his Timeline.',
'uploader': 'Asif Nawab Butt',
'upload_date': '20140506',
'timestamp': 1399398998,
}, {
'url': 'https://zh-hk.facebook.com/peoplespower/videos/1135894589806027/',
'only_matching': True,
+ }, {
+ 'url': 'https://www.facebookcorewwwi.onion/video.php?v=274175099429670',
+ 'only_matching': True,
}]
@staticmethod
video_data = None
+ def extract_video_data(instances):
+ for item in instances:
+ if item[1][0] == 'VideoConfig':
+ video_item = item[2][0]
+ if video_item.get('video_id') == video_id:
+ return video_item['videoData']
+
server_js_data = self._parse_json(self._search_regex(
- r'handleServerJS\(({.+})(?:\);|,")', webpage, 'server js data', default='{}'), video_id)
- for item in server_js_data.get('instances', []):
- if item[1][0] == 'VideoConfig':
- video_data = item[2][0]['videoData']
- break
+ r'handleServerJS\(({.+})(?:\);|,")', webpage,
+ 'server js data', default='{}'), video_id, fatal=False)
+
+ if server_js_data:
+ video_data = extract_video_data(server_js_data.get('instances', []))
+
+ if not video_data:
+ server_js_data = self._parse_json(
+ self._search_regex(
+ r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+stream_pagelet',
+ webpage, 'js data', default='{}'),
+ video_id, transform_source=js_to_json, fatal=False)
+ if server_js_data:
+ video_data = extract_video_data(try_get(
+ server_js_data, lambda x: x['jsmods']['instances'],
+ list) or [])
if not video_data:
if not fatal_if_no_video:
raise ExtractorError(
'The video is not available, Facebook said: "%s"' % m_msg.group(1),
expected=True)
+ elif '>You must log in to continue' in webpage:
+ self.raise_login_required()
else:
raise ExtractorError('Cannot parse data')
video_title = self._html_search_regex(
r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',
webpage, 'alternative title', default=None)
- video_title = limit_length(video_title, 80)
if not video_title:
+ video_title = self._html_search_meta(
+ 'description', webpage, 'title')
+ if video_title:
+ video_title = limit_length(video_title, 80)
+ else:
video_title = 'Facebook video #%s' % video_id
- uploader = clean_html(get_element_by_id('fbPhotoPageAuthorName', webpage))
+ uploader = clean_html(get_element_by_id(
+ 'fbPhotoPageAuthorName', webpage)) or self._search_regex(
+ r'ownerName\s*:\s*"([^"]+)"', webpage, 'uploader', fatal=False)
timestamp = int_or_none(self._search_regex(
r'<abbr[^>]+data-utime=["\'](\d+)', webpage,
'timestamp', default=None))
'id': '201403223kCqB3Ez',
'ext': 'flv',
'title': 'プリズン・ブレイク S1-01 マイケル 【吹替】',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import (
+ compat_str,
+ compat_HTTPError,
+)
+from ..utils import (
+ qualities,
+ strip_or_none,
+ int_or_none,
+ ExtractorError,
+)
+
+
+class FilmOnIE(InfoExtractor):
+ IE_NAME = 'filmon'
+ _VALID_URL = r'(?:https?://(?:www\.)?filmon\.com/vod/view/|filmon:)(?P<id>\d+)'
+ _TESTS = [{
+ 'url': 'https://www.filmon.com/vod/view/24869-0-plan-9-from-outer-space',
+ 'info_dict': {
+ 'id': '24869',
+ 'ext': 'mp4',
+ 'title': 'Plan 9 From Outer Space',
+ 'description': 'Dead human, zombies and vampires',
+ },
+ }, {
+ 'url': 'https://www.filmon.com/vod/view/2825-1-popeye-series-1',
+ 'info_dict': {
+ 'id': '2825',
+ 'title': 'Popeye Series 1',
+ 'description': 'The original series of Popeye.',
+ },
+ 'playlist_mincount': 8,
+ }]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ try:
+ response = self._download_json(
+ 'https://www.filmon.com/api/vod/movie?id=%s' % video_id,
+ video_id)['response']
+ except ExtractorError as e:
+ if isinstance(e.cause, compat_HTTPError):
+ errmsg = self._parse_json(e.cause.read().decode(), video_id)['reason']
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, errmsg), expected=True)
+ raise
+
+ title = response['title']
+ description = strip_or_none(response.get('description'))
+
+ if response.get('type_id') == 1:
+ entries = [self.url_result('filmon:' + episode_id) for episode_id in response.get('episodes', [])]
+ return self.playlist_result(entries, video_id, title, description)
+
+ QUALITY = qualities(('low', 'high'))
+ formats = []
+ for format_id, stream in response.get('streams', {}).items():
+ stream_url = stream.get('url')
+ if not stream_url:
+ continue
+ formats.append({
+ 'format_id': format_id,
+ 'url': stream_url,
+ 'ext': 'mp4',
+ 'quality': QUALITY(stream.get('quality')),
+ 'protocol': 'm3u8_native',
+ })
+ self._sort_formats(formats)
+
+ thumbnails = []
+ poster = response.get('poster', {})
+ thumbs = poster.get('thumbs', {})
+ thumbs['poster'] = poster
+ for thumb_id, thumb in thumbs.items():
+ thumb_url = thumb.get('url')
+ if not thumb_url:
+ continue
+ thumbnails.append({
+ 'id': thumb_id,
+ 'url': thumb_url,
+ 'width': int_or_none(thumb.get('width')),
+ 'height': int_or_none(thumb.get('height')),
+ })
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'formats': formats,
+ 'description': description,
+ 'thumbnails': thumbnails,
+ }
+
+
+class FilmOnChannelIE(InfoExtractor):
+ IE_NAME = 'filmon:channel'
+ _VALID_URL = r'https?://(?:www\.)?filmon\.com/(?:tv|channel)/(?P<id>[a-z0-9-]+)'
+ _TESTS = [{
+ # VOD
+ 'url': 'http://www.filmon.com/tv/sports-haters',
+ 'info_dict': {
+ 'id': '4190',
+ 'ext': 'mp4',
+ 'title': 'Sports Haters',
+ 'description': 'md5:dabcb4c1d9cfc77085612f1a85f8275d',
+ },
+ }, {
+ # LIVE
+ 'url': 'https://www.filmon.com/channel/filmon-sports',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://www.filmon.com/tv/2894',
+ 'only_matching': True,
+ }]
+
+ _THUMBNAIL_RES = [
+ ('logo', 56, 28),
+ ('big_logo', 106, 106),
+ ('extra_big_logo', 300, 300),
+ ]
+
+ def _real_extract(self, url):
+ channel_id = self._match_id(url)
+
+ try:
+ channel_data = self._download_json(
+ 'http://www.filmon.com/api-v2/channel/' + channel_id, channel_id)['data']
+ except ExtractorError as e:
+ if isinstance(e.cause, compat_HTTPError):
+ errmsg = self._parse_json(e.cause.read().decode(), channel_id)['message']
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, errmsg), expected=True)
+ raise
+
+ channel_id = compat_str(channel_data['id'])
+ is_live = not channel_data.get('is_vod') and not channel_data.get('is_vox')
+ title = channel_data['title']
+
+ QUALITY = qualities(('low', 'high'))
+ formats = []
+ for stream in channel_data.get('streams', []):
+ stream_url = stream.get('url')
+ if not stream_url:
+ continue
+ if not is_live:
+ formats.extend(self._extract_wowza_formats(
+ stream_url, channel_id, skip_protocols=['dash', 'rtmp', 'rtsp']))
+ continue
+ quality = stream.get('quality')
+ formats.append({
+ 'format_id': quality,
+ # This is an m3u8 stream, but we deliberately avoid _extract_m3u8_formats
+ # because the stream has no bitrate variants anyway
+ 'url': stream_url,
+ 'ext': 'mp4',
+ 'quality': QUALITY(quality),
+ })
+ self._sort_formats(formats)
+
+ thumbnails = []
+ for name, width, height in self._THUMBNAIL_RES:
+ thumbnails.append({
+ 'id': name,
+ 'url': 'http://static.filmon.com/assets/channels/%s/%s.png' % (channel_id, name),
+ 'width': width,
+ 'height': height,
+ })
+
+ return {
+ 'id': channel_id,
+ 'display_id': channel_data.get('alias'),
+ 'title': self._live_title(title) if is_live else title,
+ 'description': channel_data.get('description'),
+ 'thumbnails': thumbnails,
+ 'formats': formats,
+ 'is_live': is_live,
+ }
from __future__ import unicode_literals
from .common import InfoExtractor
-from ..compat import compat_urlparse
+from ..compat import (
+ compat_str,
+ compat_urlparse,
+)
from ..utils import (
int_or_none,
qualities,
'info_dict': {
'id': '40049',
'ext': 'mp4',
- 'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
- 'description': 'md5:36a39c1d19618fec57d12efe212a8370',
- 'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
+ 'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20150212',
'duration': 2694,
},
'info_dict': {
'id': '364746',
'ext': 'mp4',
- 'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
- 'description': 'md5:a242eea0031fd180a4497d52640a9572',
- 'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
+ 'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20160407',
'duration': 179,
'formats': 'mincount:3',
'params': {
'skip_download': True,
},
+ }, {
+ 'url': 'http://www.1tv.ru/news/issue/2016-12-01/14:00',
+ 'info_dict': {
+ 'id': '14:00',
+ 'title': 'Выпуск новостей в 14:00 1 декабря 2016 года. Новости. Первый канал',
+ 'description': 'md5:2e921b948f8c1ff93901da78ebdb1dfd',
+ },
+ 'playlist_count': 13,
+ }, {
+ 'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016',
+ 'only_matching': True,
}]
def _real_extract(self, url):
webpage = self._download_webpage(url, display_id)
playlist_url = compat_urlparse.urljoin(url, self._search_regex(
- r'data-playlist-url="([^"]+)', webpage, 'playlist url'))
+ r'data-playlist-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
+ webpage, 'playlist url', group='url'))
+
+ parsed_url = compat_urlparse.urlparse(playlist_url)
+ qs = compat_urlparse.parse_qs(parsed_url.query)
+ item_ids = qs.get('videos_ids[]') or qs.get('news_ids[]')
+
+ items = self._download_json(playlist_url, display_id)
+
+ if item_ids:
+ items = [
+ item for item in items
+ if item.get('uid') and compat_str(item['uid']) in item_ids]
+ else:
+ items = [items[0]]
+
+ entries = []
+ QUALITIES = ('ld', 'sd', 'hd', )
+
+ for item in items:
+ title = item['title']
+ quality = qualities(QUALITIES)
+ formats = []
+ path = None
+ for f in item.get('mbr', []):
+ src = f.get('src')
+ if not src or not isinstance(src, compat_str):
+ continue
+ tbr = int_or_none(self._search_regex(
+ r'_(\d{3,})\.mp4', src, 'tbr', default=None))
+ if not path:
+ path = self._search_regex(
+ r'//[^/]+/(.+?)_\d+\.mp4', src,
+ 'm3u8 path', default=None)
+ formats.append({
+ 'url': src,
+ 'format_id': f.get('name'),
+ 'tbr': tbr,
+ 'source_preference': quality(f.get('name')),
+ })
+ # m3u8 URL format is reverse engineered from [1] (search for
+ # master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru)
+ # is taken from [2].
+ # 1. http://static.1tv.ru/player/eump1tv-current/eump-1tv.all.min.js?rnd=9097422834:formatted
+ # 2. http://static.1tv.ru/player/eump1tv-config/config-main.js?rnd=9097422834
+ if not path and len(formats) == 1:
+ path = self._search_regex(
+ r'//[^/]+/(.+?$)', formats[0]['url'],
+ 'm3u8 path', default=None)
+ if path:
+ if len(formats) == 1:
+ m3u8_path = ','
+ else:
+ tbrs = [compat_str(t) for t in sorted(f['tbr'] for f in formats)]
+ m3u8_path = '_,%s,%s' % (','.join(tbrs), '.mp4')
+ formats.extend(self._extract_m3u8_formats(
+ 'http://balancer-vod.1tv.ru/%s%s.urlset/master.m3u8'
+ % (path, m3u8_path),
+ display_id, 'mp4',
+ entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
+ self._sort_formats(formats)
+
+ thumbnail = item.get('poster') or self._og_search_thumbnail(webpage)
+ duration = int_or_none(item.get('duration') or self._html_search_meta(
+ 'video:duration', webpage, 'video duration', fatal=False))
+ upload_date = unified_strdate(self._html_search_meta(
+ 'ya:ovs:upload_date', webpage, 'upload date', default=None))
- item = self._download_json(playlist_url, display_id)[0]
- video_id = item['id']
- quality = qualities(('ld', 'sd', 'hd', ))
- formats = []
- for f in item.get('mbr', []):
- src = f.get('src')
- if not src:
- continue
- fname = f.get('name')
- formats.append({
- 'url': src,
- 'format_id': fname,
- 'quality': quality(fname),
+ entries.append({
+ 'id': compat_str(item.get('id') or item['uid']),
+ 'thumbnail': thumbnail,
+ 'title': title,
+ 'upload_date': upload_date,
+ 'duration': int_or_none(duration),
+ 'formats': formats
})
- self._sort_formats(formats)
title = self._html_search_regex(
(r'<div class="tv_translation">\s*<h1><a href="[^"]+">([^<]*)</a>',
r"'title'\s*:\s*'([^']+)'"),
- webpage, 'title', default=None) or item['title']
+ webpage, 'title', default=None) or self._og_search_title(
+ webpage, default=None)
description = self._html_search_regex(
r'<div class="descr">\s*<div> </div>\s*<p>([^<]*)</p></div>',
webpage, 'description', default=None) or self._html_search_meta(
- 'description', webpage, 'description')
- duration = int_or_none(self._html_search_meta(
- 'video:duration', webpage, 'video duration', fatal=False))
- upload_date = unified_strdate(self._html_search_meta(
- 'ya:ovs:upload_date', webpage, 'upload date', fatal=False))
+ 'description', webpage, 'description', default=None)
- return {
- 'id': video_id,
- 'thumbnail': item.get('poster') or self._og_search_thumbnail(webpage),
- 'title': title,
- 'description': description,
- 'upload_date': upload_date,
- 'duration': int_or_none(duration),
- 'formats': formats
- }
+ return self.playlist_result(entries, display_id, title, description)
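The master.m3u8 URL construction in the 1tv changes above (single bitrate uses a bare `,` placeholder; multiple bitrates use `_,<tbr1>,<tbr2>,.mp4`) can be illustrated standalone; the path and bitrates here are hypothetical examples:

```python
def build_master_m3u8(path, tbrs):
    # Mirrors the reverse-engineered balancer-vod.1tv.ru URL scheme above:
    # one format -> ',' placeholder; several -> '_,tbr1,tbr2,.mp4'
    if len(tbrs) == 1:
        m3u8_path = ','
    else:
        m3u8_path = '_,%s,%s' % (
            ','.join(str(t) for t in sorted(tbrs)), '.mp4')
    return ('http://balancer-vod.1tv.ru/%s%s.urlset/master.m3u8'
            % (path, m3u8_path))


print(build_master_m3u8('video/multibitrate/clip', [360, 720]))
# http://balancer-vod.1tv.ru/video/multibitrate/clip_,360,720,.mp4.urlset/master.m3u8
```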
'ext': 'mp4',
'title': 'Россияне выбрали имя для общенациональной платежной системы',
'description': 'md5:a8aa13e2b7ad36789e9f77a74b6de660',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
'ext': 'mp4',
'title': '3D принтер',
'description': 'md5:d76c736d29ef7ec5c0cf7d7c65ffcb41',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 180,
},
}, {
'id': 'glavnoe',
'ext': 'mp4',
'title': 'Итоги недели с 8 по 14 июня 2015 года',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.5-tv.ru/glavnoe/broadcasts/508645/',
'id': '1',
'ext': 'mp4',
'title': 'Folge 1 vom 10. April 2007',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}
'filesize': int_or_none(cover.get('size')),
} for cover in flipagram.get('covers', []) if cover.get('url')]
- # Note that this only retrieves comments that are initally loaded.
+ # Note that this only retrieves comments that are initially loaded.
# For videos with large amounts of comments, most won't be retrieved.
comments = []
for comment in video_data.get('comments', {}).get(video_id, {}).get('items', []):
'title': 'Fuck Turkish-style',
'description': 'md5:6ae2d9486921891efe89231ace13ffdf',
'age_limit': 18,
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
},
}
'duration': 265,
'timestamp': 1304411491,
'upload_date': '20110503',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'duration': 292,
'timestamp': 1417662047,
'upload_date': '20141204',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'description': 'Is campus censorship getting out of control?',
'timestamp': 1472168725,
'upload_date': '20160825',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'display_id': 'rendez-vous-au-pays-des-geeks',
'ext': 'mp3',
'title': 'Rendez-vous au pays des geeks',
- 'thumbnail': 're:^https?://.*\\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140301',
'vcodec': 'none',
}
'id': 'NI_173343',
'ext': 'mp4',
'title': 'Les entreprises familiales : le secret de la réussite',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1433273139,
'upload_date': '20150602',
},
'ext': 'mp4',
'title': 'Olivier Monthus, réalisateur de "Bretagne, le choix de l’Armor"',
'description': 'md5:a3264114c9d29aeca11ced113c37b16c',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1458300695,
'upload_date': '20160318',
},
import re
from .common import InfoExtractor
+from ..utils import (
+ float_or_none,
+ get_element_by_class,
+ get_element_by_id,
+ unified_strdate,
+)
class FreesoundIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?freesound\.org/people/([^/]+)/sounds/(?P<id>[^/]+)'
+ _VALID_URL = r'https?://(?:www\.)?freesound\.org/people/[^/]+/sounds/(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.freesound.org/people/miklovan/sounds/194503/',
'md5': '12280ceb42c81f19a515c745eae07650',
'id': '194503',
'ext': 'mp3',
'title': 'gulls in the city.wav',
- 'uploader': 'miklovan',
'description': 'the sounds of seagulls in the city',
+ 'duration': 130.233,
+ 'uploader': 'miklovan',
+ 'upload_date': '20130715',
+ 'tags': list,
}
}
def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- music_id = mobj.group('id')
- webpage = self._download_webpage(url, music_id)
- title = self._html_search_regex(
- r'<div id="single_sample_header">.*?<a href="#">(.+?)</a>',
- webpage, 'music title', flags=re.DOTALL)
+ audio_id = self._match_id(url)
+
+ webpage = self._download_webpage(url, audio_id)
+
+ audio_url = self._og_search_property('audio', webpage, 'song url')
+ title = self._og_search_property('audio:title', webpage, 'song title')
+
description = self._html_search_regex(
- r'<div id="sound_description">(.*?)</div>', webpage, 'description',
- fatal=False, flags=re.DOTALL)
+ r'(?s)id=["\']sound_description["\'][^>]*>(.+?)</div>',
+ webpage, 'description', fatal=False)
+
+ duration = float_or_none(
+ get_element_by_class('duration', webpage), scale=1000)
+
+ upload_date = unified_strdate(get_element_by_id('sound_date', webpage))
+ uploader = self._og_search_property(
+ 'audio:artist', webpage, 'uploader', fatal=False)
+
+ channels = self._html_search_regex(
+ r'Channels</dt><dd>(.+?)</dd>', webpage,
+ 'channels info', fatal=False)
+
+ tags_str = get_element_by_class('tags', webpage)
+ tags = re.findall(r'<a[^>]+>([^<]+)', tags_str) if tags_str else None
+
+ audio_urls = [audio_url]
+
+ LQ_FORMAT = '-lq.mp3'
+ if LQ_FORMAT in audio_url:
+ audio_urls.append(audio_url.replace(LQ_FORMAT, '-hq.mp3'))
+
+ formats = [{
+ 'url': format_url,
+ 'format_note': channels,
+ 'quality': quality,
+ } for quality, format_url in enumerate(audio_urls)]
+ self._sort_formats(formats)
return {
- 'id': music_id,
+ 'id': audio_id,
'title': title,
- 'url': self._og_search_property('audio', webpage, 'music url'),
- 'uploader': self._og_search_property('audio:artist', webpage, 'music uploader'),
'description': description,
+ 'duration': duration,
+ 'uploader': uploader,
+ 'upload_date': upload_date,
+ 'tags': tags,
+ 'formats': formats,
}
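The Freesound format derivation above builds the URL list by swapping the `-lq.mp3` suffix for `-hq.mp3`. A standalone sketch of that step (the example URLs are hypothetical, only the suffix pattern is taken from the extractor):

```python
def freesound_format_urls(audio_url):
    # Start from the og:audio preview URL; if it is the low-quality
    # '-lq.mp3' variant, also derive the high-quality '-hq.mp3' one.
    # Later entries get a higher quality index, as in the extractor above.
    urls = [audio_url]
    lq_suffix = '-lq.mp3'
    if lq_suffix in audio_url:
        urls.append(audio_url.replace(lq_suffix, '-hq.mp3'))
    return urls


print(freesound_format_urls('https://example.org/previews/194503-lq.mp3'))
# ['https://example.org/previews/194503-lq.mp3',
#  'https://example.org/previews/194503-hq.mp3']
```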
+++ /dev/null
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from ..utils import ExtractorError
-
-
-class FreeVideoIE(InfoExtractor):
- _VALID_URL = r'^https?://www.freevideo.cz/vase-videa/(?P<id>[^.]+)\.html(?:$|[?#])'
-
- _TEST = {
- 'url': 'http://www.freevideo.cz/vase-videa/vysukany-zadecek-22033.html',
- 'info_dict': {
- 'id': 'vysukany-zadecek-22033',
- 'ext': 'mp4',
- 'title': 'vysukany-zadecek-22033',
- 'age_limit': 18,
- },
- 'skip': 'Blocked outside .cz',
- }
-
- def _real_extract(self, url):
- video_id = self._match_id(url)
- webpage, handle = self._download_webpage_handle(url, video_id)
- if '//www.czechav.com/' in handle.geturl():
- raise ExtractorError(
- 'Access to freevideo is blocked from your location',
- expected=True)
-
- video_url = self._search_regex(
- r'\s+url: "(http://[a-z0-9-]+.cdn.freevideo.cz/stream/.*?/video.mp4)"',
- webpage, 'video URL')
-
- return {
- 'id': video_id,
- 'url': video_url,
- 'title': video_id,
- 'age_limit': 18,
- }
'ext': 'mp4',
'title': 'Air - 1 - Breeze',
'description': 'md5:1769f43cd5fc130ace8fd87232207892',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
},
'skip': 'Access without user interaction is forbidden by CloudFlare, and video removed',
}, {
'ext': 'mp4',
'title': '.hack//SIGN - 1 - Role Play',
'description': 'md5:b602bdc15eef4c9bbb201bb6e6a4a2dd',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
},
'skip': 'Access without user interaction is forbidden by CloudFlare',
}, {
'ext': 'mp4',
'title': 'Attack on Titan: Junior High - Broadcast Dub Preview',
'description': 'md5:f8ec49c0aff702a7832cd81b8a44f803',
- 'thumbnail': 're:https?://.*\.(?:jpg|png)',
+ 'thumbnail': r're:https?://.*\.(?:jpg|png)',
},
'skip': 'Access without user interaction is forbidden by CloudFlare',
}]
'ext': 'mp4',
'title': 'Heart-Shaped Box: Literal Video Version',
'description': 'md5:ea09a01bc9a1c46d9ab696c01747c338',
- 'thumbnail': 're:^http:.*\.jpg$',
+ 'thumbnail': r're:^http:.*\.jpg$',
},
}, {
'url': 'http://www.funnyordie.com/embed/e402820827',
'ext': 'mp4',
'title': 'Please Use This Song (Jon Lajoie)',
'description': 'Please use this to sell something. www.jonlajoie.com',
- 'thumbnail': 're:^http:.*\.jpg$',
+ 'thumbnail': r're:^http:.*\.jpg$',
},
'params': {
'skip_download': True,
webpage = self._download_webpage(url, display_id)
ooyala_code = self._search_regex(
- r'data-video-id=(["\'])(?P<code>.+?)\1',
+ r'data-ooyala-id=(["\'])(?P<code>(?:(?!\1).)+)\1',
webpage, 'ooyala code', group='code')
return OoyalaIE._build_url_result(ooyala_code)
'ext': 'mp4',
'duration': 372,
'title': 'Bloodborne - Birth of a hero',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
streams, ('progressive_hd', 'progressive_high', 'progressive_low'))
if progressive_url and manifest_url:
qualities_basename = self._search_regex(
- '/([^/]+)\.csmil/',
+ r'/([^/]+)\.csmil/',
manifest_url, 'qualities basename', default=None)
if qualities_basename:
QUALITIES_RE = r'((,\d+)+,?)'
'ext': 'mp4',
'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1406542020,
'upload_date': '20140728',
'duration': 17
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+from .common import InfoExtractor
+from ..utils import (
+ float_or_none,
+ int_or_none,
+ js_to_json,
+ unified_strdate,
+)
+
+
+class GaskrankIE(InfoExtractor):
+ """InfoExtractor for gaskrank.tv"""
+ _VALID_URL = r'https?://(?:www\.)?gaskrank\.tv/tv/(?P<categories>[^/]+)/(?P<id>[^/]+)\.html?'
+ _TESTS = [
+ {
+ 'url': 'http://www.gaskrank.tv/tv/motorrad-fun/strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden.htm',
+ 'md5': '1ae88dbac97887d85ebd1157a95fc4f9',
+ 'info_dict': {
+ 'id': '201601/26955',
+ 'ext': 'mp4',
+ 'title': 'Strike! Einparken können nur Männer - Flurschaden hält sich in Grenzen *lol*',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'categories': ['motorrad-fun'],
+ 'display_id': 'strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden',
+ 'uploader_id': 'Bikefun',
+ 'upload_date': '20170110',
+ 'uploader_url': None,
+ }
+ },
+ {
+ 'url': 'http://www.gaskrank.tv/tv/racing/isle-of-man-tt-2011-michael-du-15920.htm',
+ 'md5': 'c33ee32c711bc6c8224bfcbe62b23095',
+ 'info_dict': {
+ 'id': '201106/15920',
+ 'ext': 'mp4',
+ 'title': 'Isle of Man - Michael Dunlop vs Guy Martin - schwindelig kucken',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'categories': ['racing'],
+ 'display_id': 'isle-of-man-tt-2011-michael-du-15920',
+ 'uploader_id': 'IOM',
+ 'upload_date': '20160506',
+ 'uploader_url': 'www.iomtt.com',
+ }
+ }
+ ]
+
+ def _real_extract(self, url):
+ """extract information from gaskrank.tv"""
+ def fix_json(code):
+ """Removes trailing comma in json: {{},} --> {{}}"""
+ return re.sub(r',\s*}', r'}', js_to_json(code))
+
+ display_id = self._match_id(url)
+ webpage = self._download_webpage(url, display_id)
+ categories = [re.match(self._VALID_URL, url).group('categories')]
+ title = self._search_regex(
+ r'movieName\s*:\s*\'([^\']*)\'',
+ webpage, 'title')
+ thumbnail = self._search_regex(
+ r'poster\s*:\s*\'([^\']*)\'',
+ webpage, 'thumbnail', default=None)
+
+ mobj = re.search(
+ r'Video von:\s*(?P<uploader_id>[^|]*?)\s*\|\s*vom:\s*(?P<upload_date>[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9])',
+ webpage)
+ # initialize so the return dict below doesn't hit a NameError
+ # when the byline is missing from the page
+ uploader_id = upload_date = None
+ if mobj is not None:
+ uploader_id = mobj.groupdict().get('uploader_id')
+ upload_date = unified_strdate(mobj.groupdict().get('upload_date'))
+
+ uploader_url = self._search_regex(
+ r'Homepage:\s*<[^>]*>(?P<uploader_url>[^<]*)',
+ webpage, 'uploader_url', default=None)
+ tags = re.findall(
+ r'/tv/tags/[^/]+/"\s*>(?P<tag>[^<]*?)<',
+ webpage)
+
+ view_count = self._search_regex(
+ r'class\s*=\s*"gkRight"(?:[^>]*>\s*<[^>]*)*icon-eye-open(?:[^>]*>\s*<[^>]*)*>\s*(?P<view_count>[0-9\.]*)',
+ webpage, 'view_count', default=None)
+ if view_count:
+ view_count = int_or_none(view_count.replace('.', ''))
+
+ average_rating = self._search_regex(
+ r'itemprop\s*=\s*"ratingValue"[^>]*>\s*(?P<average_rating>[0-9,]+)',
+ webpage, 'average_rating')
+ if average_rating:
+ average_rating = float_or_none(average_rating.replace(',', '.'))
+
+ playlist = self._parse_json(
+ self._search_regex(
+ r'playlist\s*:\s*\[([^\]]*)\]',
+ webpage, 'playlist', default='{}'),
+ display_id, transform_source=fix_json, fatal=False)
+
+ video_id = self._search_regex(
+ r'https?://movies\.gaskrank\.tv/([^-]*?)(-[^\.]*)?\.mp4',
+ playlist.get('0').get('src'), 'video id')
+
+ formats = []
+ for key in playlist:
+ formats.append({
+ 'url': playlist[key]['src'],
+ 'format_id': key,
+ 'quality': playlist[key].get('quality')})
+ self._sort_formats(formats, field_preference=['format_id'])
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'formats': formats,
+ 'thumbnail': thumbnail,
+ 'categories': categories,
+ 'display_id': display_id,
+ 'uploader_id': uploader_id,
+ 'upload_date': upload_date,
+ 'uploader_url': uploader_url,
+ 'tags': tags,
+ 'view_count': view_count,
+ 'average_rating': average_rating,
+ }
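The `fix_json` helper in the gaskrank extractor above strips trailing commas so the page's JavaScript playlist literal can be parsed as JSON. A self-contained sketch of the same idea (the sample literal is hypothetical; the real helper additionally runs `js_to_json` first to normalize quoting):

```python
import json
import re


def strip_trailing_commas(code):
    # Drop a comma that directly precedes a closing brace, e.g.
    # '{"src": "a.mp4",}' -> '{"src": "a.mp4"}', so json.loads accepts it.
    return re.sub(r',\s*}', '}', code)


cleaned = strip_trailing_commas('{"0": {"src": "a.mp4",},}')
print(json.loads(cleaned))  # {'0': {'src': 'a.mp4'}}
```

Note the regex only looks at braces, matching the original helper; array trailing commas would need an analogous rule for `]`.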
'ext': 'mp4',
'title': '«70–80 процентов гражданских в Донецке на грани голода»',
'description': 'md5:38617526050bd17b234728e7f9620a71',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
'skip': 'video not found',
}, {
UnsupportedError,
xpath_text,
)
+from .commonprotocols import RtmpIE
from .brightcove import (
BrightcoveLegacyIE,
BrightcoveNewIE,
from .eagleplatform import EaglePlatformIE
from .facebook import FacebookIE
from .soundcloud import SoundcloudIE
+from .tunein import TuneInBaseIE
from .vbox7 import Vbox7IE
from .dbtv import DBTVIE
+from .piksel import PikselIE
+from .videa import VideaIE
+from .twentymin import TwentyMinutenIE
+from .ustream import UstreamIE
+from .openload import OpenloadIE
+from .videopress import VideoPressIE
class GenericIE(InfoExtractor):
'ext': 'mp4',
'title': 'Tikibad ontruimd wegens brand',
'description': 'md5:05ca046ff47b931f9b04855015e163a4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 33,
},
'params': {
'ext': 'mp4',
'upload_date': '20130224',
'uploader_id': 'TheVerge',
- 'description': 're:^Chris Ziegler takes a look at the\.*',
+ 'description': r're:^Chris Ziegler takes a look at the\.*',
'uploader': 'The Verge',
'title': 'First Firefox OS phones side-by-side',
},
},
'skip': 'There is a limit of 200 free downloads / month for the test song',
},
- # embedded brightcove video
- # it also tests brightcove videos that need to set the 'Referer' in the
- # http requests
{
+ # embedded brightcove video
+ # it also tests brightcove videos that need to set the 'Referer'
+ # in the http requests
'add_ie': ['BrightcoveLegacy'],
'url': 'http://www.bfmtv.com/video/bfmbusiness/cours-bourse/cours-bourse-l-analyse-technique-154522/',
'info_dict': {
'skip_download': True,
},
},
+ {
+ # embedded with itemprop embedURL and video id spelled as `idVideo`
+ 'add_ie': ['BrightcoveLegacy'],
+ 'url': 'http://bfmbusiness.bfmtv.com/mediaplayer/chroniques/olivier-delamarche/',
+ 'info_dict': {
+ 'id': '5255628253001',
+ 'ext': 'mp4',
+ 'title': 'md5:37c519b1128915607601e75a87995fc0',
+ 'description': 'md5:37f7f888b434bb8f8cc8dbd4f7a4cf26',
+ 'uploader': 'BFM BUSINESS',
+ 'uploader_id': '876450612001',
+ 'timestamp': 1482255315,
+ 'upload_date': '20161220',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ },
{
# https://github.com/rg3/youtube-dl/issues/2253
'url': 'http://bcove.me/i6nfkrc3',
'skip_download': True, # m3u8 download
},
},
+ {
+ # Brightcove with alternative playerID key
+ 'url': 'http://www.nature.com/nmeth/journal/v9/n7/fig_tab/nmeth.2062_SV1.html',
+ 'info_dict': {
+ 'id': 'nmeth.2062_SV1',
+ 'title': 'Simultaneous multiview imaging of the Drosophila syncytial blastoderm : Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy : Nature Methods : Nature Research',
+ },
+ 'playlist': [{
+ 'info_dict': {
+ 'id': '2228375078001',
+ 'ext': 'mp4',
+ 'title': 'nmeth.2062-sv1',
+ 'description': 'nmeth.2062-sv1',
+ 'timestamp': 1363357591,
+ 'upload_date': '20130315',
+ 'uploader': 'Nature Publishing Group',
+ 'uploader_id': '1964492299001',
+ },
+ }],
+ },
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
'id': 'f4dafcad-ff21-423d-89b5-146cfd89fa1e',
'ext': 'mp4',
'title': 'Ужастики, русский трейлер (2015)',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 153,
}
},
'description': 'md5:8145d19d320ff3e52f28401f4c4283b9',
}
},
- # Embedded Ustream video
- {
- 'url': 'http://www.american.edu/spa/pti/nsa-privacy-janus-2014.cfm',
- 'md5': '27b99cdb639c9b12a79bca876a073417',
- 'info_dict': {
- 'id': '45734260',
- 'ext': 'flv',
- 'uploader': 'AU SPA: The NSA and Privacy',
- 'title': 'NSA and Privacy Forum Debate featuring General Hayden and Barton Gellman'
- }
- },
# nowvideo embed hidden behind percent encoding
{
'url': 'http://www.waoanime.tv/the-super-dimension-fortress-macross-episode-1/',
'duration': 48,
'timestamp': 1401537900,
'upload_date': '20140531',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
# Wistia embed
},
'playlist_mincount': 7,
},
+ # TuneIn station embed
+ {
+ 'url': 'http://radiocnrv.com/promouvoir-radio-cnrv/',
+ 'info_dict': {
+ 'id': '204146',
+ 'ext': 'mp3',
+ 'title': 'CNRV',
+ 'location': 'Paris, France',
+ 'is_live': True,
+ },
+ 'params': {
+ # Live stream
+ 'skip_download': True,
+ },
+ },
# Livestream embed
{
'url': 'http://www.esa.int/Our_Activities/Space_Science/Rosetta/Philae_comet_touch-down_webcast',
'title': 'Webinar: Using Discovery, The National Archives’ online catalogue',
},
},
+ # jwplayer rtmp
+ {
+ 'url': 'http://www.suffolk.edu/sjc/',
+ 'info_dict': {
+ 'id': 'sjclive',
+ 'ext': 'flv',
+ 'title': 'Massachusetts Supreme Judicial Court Oral Arguments',
+ 'uploader': 'www.suffolk.edu',
+ },
+ 'params': {
+ 'skip_download': True,
+ }
+ },
# rtl.nl embed
{
'url': 'http://www.rtlnieuws.nl/nieuws/buitenland/aanslagen-kopenhagen',
'skip_download': True,
}
},
+ {
+ # Kaltura embedded, some fileExt broken (#11480)
+ 'url': 'http://www.cornell.edu/video/nima-arkani-hamed-standard-models-of-particle-physics',
+ 'info_dict': {
+ 'id': '1_sgtvehim',
+ 'ext': 'mp4',
+ 'title': 'Our "Standard Models" of particle physics and cosmology',
+ 'description': 'md5:67ea74807b8c4fea92a6f38d6d323861',
+ 'timestamp': 1321158993,
+ 'upload_date': '20111113',
+ 'uploader_id': 'kps1',
+ },
+ 'add_ie': ['Kaltura'],
+ },
# Eagle.Platform embed (generic URL)
{
'url': 'http://lenta.ru/news/2015/03/06/navalny/',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
'id': '3519514',
'ext': 'mp4',
'title': 'Joe Dirt 2 Beautiful Loser Teaser Trailer',
- 'thumbnail': 're:^https?://.*\.png$',
+ 'thumbnail': r're:^https?://.*\.png$',
'duration': 45.115,
},
},
'id': '300346',
'ext': 'mp4',
'title': '中一中男師變性 全校師生力挺',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'ext': 'mp4',
'title': 'Sauvons les abeilles ! - Le débat',
'description': 'md5:d9082128b1c5277987825d684939ca26',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'timestamp': 1434970506,
'upload_date': '20150622',
'uploader': 'Public Sénat',
'id': '2855',
'ext': 'mp4',
'title': 'Don’t Understand Bitcoin? This Man Will Mumble An Explanation At You',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'uploader': 'ClickHole',
'uploader_id': 'clickhole',
}
},
'playlist_mincount': 3,
},
+ {
+ # Videa embeds
+ 'url': 'http://forum.dvdtalk.com/movie-talk/623756-deleted-magic-star-wars-ot-deleted-alt-scenes-docu-style.html',
+ 'info_dict': {
+ 'id': '623756-deleted-magic-star-wars-ot-deleted-alt-scenes-docu-style',
+ 'title': 'Deleted Magic - Star Wars: OT Deleted / Alt. Scenes Docu. Style - DVD Talk Forum',
+ },
+ 'playlist_mincount': 2,
+ },
+ {
+ # 20 minuten embed
+ 'url': 'http://www.20min.ch/schweiz/news/story/So-kommen-Sie-bei-Eis-und-Schnee-sicher-an-27032552',
+ 'info_dict': {
+ 'id': '523629',
+ 'ext': 'mp4',
+ 'title': 'So kommen Sie bei Eis und Schnee sicher an',
+ 'description': 'md5:117c212f64b25e3d95747e5276863f7d',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ 'add_ie': [TwentyMinutenIE.ie_key()],
+ },
+ {
+ # VideoPress embed
+ 'url': 'https://en.support.wordpress.com/videopress/',
+ 'info_dict': {
+ 'id': 'OcobLTqC',
+ 'ext': 'm4v',
+ 'title': 'IMG_5786',
+ 'timestamp': 1435711927,
+ 'upload_date': '20150701',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ 'add_ie': [VideoPressIE.ie_key()],
+ }
# {
# # TODO: find another test
# # http://schema.org/VideoObject
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage) or
re.search(r'data-ooyala-video-id\s*=\s*[\'"](?P<ec>.{32})[\'"]', webpage))
if mobj is not None:
- return OoyalaIE._build_url_result(smuggle_url(mobj.group('ec'), {'domain': url}))
+ embed_token = self._search_regex(
+ r'embedToken[\'"]?\s*:\s*[\'"]([^\'"]+)',
+ webpage, 'ooyala embed token', default=None)
+ return OoyalaIE._build_url_result(smuggle_url(
+ mobj.group('ec'), {
+ 'domain': url,
+ 'embed_token': embed_token,
+ }))
# Look for multiple Ooyala embeds on SBN network websites
mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
return self.url_result(mobj.group('url'), 'TED')
# Look for embedded Ustream videos
- mobj = re.search(
- r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
- if mobj is not None:
- return self.url_result(mobj.group('url'), 'Ustream')
+ ustream_url = UstreamIE._extract_url(webpage)
+ if ustream_url:
+ return self.url_result(ustream_url, UstreamIE.ie_key())
# Look for embedded arte.tv player
mobj = re.search(
if soundcloud_urls:
return _playlist_from_matches(soundcloud_urls, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
+ # Look for tunein player
+ tunein_urls = TuneInBaseIE._extract_urls(webpage)
+ if tunein_urls:
+ return _playlist_from_matches(tunein_urls)
+
# Look for embedded mtvservices player
mtvservices_url = MTVServicesEmbeddedIE._extract_url(webpage)
if mtvservices_url:
if arkena_url:
return self.url_result(arkena_url, ArkenaIE.ie_key())
+ # Look for Piksel embeds
+ piksel_url = PikselIE._extract_url(webpage)
+ if piksel_url:
+ return self.url_result(piksel_url, PikselIE.ie_key())
+
# Look for Limelight embeds
mobj = re.search(r'LimelightPlayer\.doLoad(Media|Channel|ChannelList)\(["\'](?P<id>[a-z0-9]{32})', webpage)
if mobj:
if dbtv_urls:
return _playlist_from_matches(dbtv_urls, ie=DBTVIE.ie_key())
+ # Look for Videa embeds
+ videa_urls = VideaIE._extract_urls(webpage)
+ if videa_urls:
+ return _playlist_from_matches(videa_urls, ie=VideaIE.ie_key())
+
+ # Look for 20 minuten embeds
+ twentymin_urls = TwentyMinutenIE._extract_urls(webpage)
+ if twentymin_urls:
+ return _playlist_from_matches(
+ twentymin_urls, ie=TwentyMinutenIE.ie_key())
+
+ # Look for Openload embeds
+ openload_urls = OpenloadIE._extract_urls(webpage)
+ if openload_urls:
+ return _playlist_from_matches(
+ openload_urls, ie=OpenloadIE.ie_key())
+
+ # Look for VideoPress embeds
+ videopress_urls = VideoPressIE._extract_urls(webpage)
+ if videopress_urls:
+ return _playlist_from_matches(
+ videopress_urls, ie=VideoPressIE.ie_key())
+
# Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(
webpage, video_id, default={}, expected_type='VideoObject')
def check_video(vurl):
if YoutubeIE.suitable(vurl):
return True
+ if RtmpIE.suitable(vurl):
+ return True
vpath = compat_urlparse.urlparse(vurl).path
vext = determine_ext(vpath)
return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml', 'js')
'age_limit': age_limit,
}
+ if RtmpIE.suitable(video_url):
+ entry_info_dict.update({
+ '_type': 'url_transparent',
+ 'ie_key': RtmpIE.ie_key(),
+ 'url': video_url,
+ })
+ entries.append(entry_info_dict)
+ continue
+
ext = determine_ext(video_url)
if ext == 'smil':
entry_info_dict['formats'] = self._extract_smil_formats(video_url, video_id)
'title': 'Quick Look: Destiny: The Dark Below',
'description': 'md5:0aa3aaf2772a41b91d44c63f30dfad24',
'duration': 2399,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
'ext': 'mp4',
'title': 'Anime Awesome: Chihiros Reise ins Zauberland – Das Beste kommt zum Schluss',
'description': 'md5:afdf5862241aded4718a30dff6a57baf',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 578,
'timestamp': 1414749706,
'upload_date': '20141031',
'id': 'UZF8zlmuQbe4mr+7dCiQ0w==',
'ext': 'mp4',
'title': "Damon's Glide message",
- 'thumbnail': 're:^https?://.*?\.cloudfront\.net/.*\.jpg$',
+ 'thumbnail': r're:^https?://.*?\.cloudfront\.net/.*\.jpg$',
}
}
sub_domain, video_id, display_id = re.match(self._VALID_URL, url).groups()
if not video_id:
webpage = self._download_webpage(url, display_id)
- video_id = self._search_regex(r'data-video-id=["\']VDKA(\w+)', webpage, 'video id')
+ video_id = self._search_regex(
+ # There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
+ # from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
+ r'data-video-id=["\']*VDKA(\w+)', webpage, 'video id')
brand = self._BRANDS[sub_domain]
video_data = self._download_json(
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (brand, video_id),
'timestamp': 1205712000,
'uploader': 'beverlybmusic',
'upload_date': '20080317',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
]
from ..utils import (
ExtractorError,
int_or_none,
+ lowercase_escape,
)
_VALID_URL = r'https?://(?:(?:docs|drive)\.google\.com/(?:uc\?.*?id=|file/d/)|video\.google\.com/get_player\?.*?docid=)(?P<id>[a-zA-Z0-9_-]{28,})'
_TESTS = [{
'url': 'https://drive.google.com/file/d/0ByeS4oOUV-49Zzh4R1J6R09zazQ/edit?pli=1',
- 'md5': '881f7700aec4f538571fa1e0eed4a7b6',
+ 'md5': 'd109872761f7e7ecf353fa108c0dbe1e',
'info_dict': {
'id': '0ByeS4oOUV-49Zzh4R1J6R09zazQ',
'ext': 'mp4',
'title': 'Big Buck Bunny.mp4',
- 'duration': 46,
+ 'duration': 45,
}
}, {
# video id is longer than 28 characters
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
- 'http://docs.google.com/file/d/%s' % video_id, video_id, encoding='unicode_escape')
+ 'http://docs.google.com/file/d/%s' % video_id, video_id)
reason = self._search_regex(r'"reason"\s*,\s*"([^"]+)', webpage, 'reason', default=None)
if reason:
resolution = fmt.split('/')[1]
width, height = resolution.split('x')
formats.append({
- 'url': fmt_url,
+ 'url': lowercase_escape(fmt_url),
'format_id': fmt_id,
'resolution': resolution,
'width': int_or_none(width),
'id': '299069',
'ext': 'flv',
'title': 'DIESEL SFW XXX Video',
- 'thumbnail': 're:^http://.*\.jpg$',
+ 'thumbnail': r're:^http://.*\.jpg$',
'duration': 80,
'age_limit': 18,
}
'id': '1437839',
'ext': 'mp4',
'title': 'Ep. 64 Clip: Encryption',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'duration': 1072,
}
}
'display_id': 'ep-52-inside-the-episode',
'ext': 'mp4',
'title': 'Ep. 52: Inside the Episode',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'duration': 240,
},
}, {
'id': '150939',
'ext': 'wav',
'title': 'Moofi - Dr. Kreep',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421564134,
'description': 'Listen to Dr. Kreep by Moofi on hearthis.at - Modular, Eurorack, Mutable Intruments Braids, Valhalla-DSP',
'upload_date': '20150118',
'description': 'Listen to DJ Jim Hopkins - Totally Bitchin\' 80\'s Dance Mix! by TwitchSF on hearthis.at - Dance',
'upload_date': '20160328',
'timestamp': 1459186146,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'comment_count': int,
'view_count': int,
'like_count': int,
'timestamp': 1411812600,
'upload_date': '20140927',
'description': 'In uplink-Episode 3.3 geht es darum, wie man sich von Cloud-Anbietern emanzipieren kann, worauf man beim Kauf einer Tastatur achten sollte und was Smartphones über uns verraten.',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
}
}
'display_id': 'dixie-is-posing-with-naked-ass-very-erotic',
'ext': 'mp4',
'title': 'Dixie is posing with naked ass very erotic',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
}
}, {
'ext': 'mov',
'title': 'Historic Films: GP-7',
'description': 'md5:1a86a0f3ac54024e419aba97210d959a',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2096,
},
}
'alt_title': 'hitboxlive - Aug 9th #6',
'description': '',
'ext': 'mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 215.1666,
'resolution': 'HD 720p',
'uploader': 'hitboxlive',
if cdn.get('rtmpSubscribe') is True:
continue
base_url = cdn.get('netConnectionUrl')
- host = re.search('.+\.([^\.]+\.[^\./]+)/.+', base_url).group(1)
+ host = re.search(r'.+\.([^\.]+\.[^\./]+)/.+', base_url).group(1)
if base_url not in servers:
servers.append(base_url)
for stream in cdn.get('bitrates'):
--- /dev/null
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+ clean_html,
+ float_or_none,
+ int_or_none,
+ try_get,
+)
+
+
+class HitRecordIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?hitrecord\.org/records/(?P<id>\d+)'
+ _TEST = {
+ 'url': 'https://hitrecord.org/records/2954362',
+ 'md5': 'fe1cdc2023bce0bbb95c39c57426aa71',
+ 'info_dict': {
+ 'id': '2954362',
+ 'ext': 'mp4',
+ 'title': 'A Very Different World (HITRECORD x ACLU)',
+ 'description': 'md5:e62defaffab5075a5277736bead95a3d',
+ 'duration': 139.327,
+ 'timestamp': 1471557582,
+ 'upload_date': '20160818',
+ 'uploader': 'Zuzi.C12',
+ 'uploader_id': '362811',
+ 'view_count': int,
+ 'like_count': int,
+ 'comment_count': int,
+ 'tags': list,
+ }
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ video = self._download_json(
+ 'https://hitrecord.org/api/web/records/%s' % video_id, video_id)
+
+ title = video['title']
+ video_url = video['source_url']['mp4_url']
+
+ tags = None
+ tags_list = try_get(video, lambda x: x['tags'], list)
+ if tags_list:
+ tags = [
+ t['text']
+ for t in tags_list
+ if isinstance(t, dict) and t.get('text') and
+ isinstance(t['text'], compat_str)]
+
+ return {
+ 'id': video_id,
+ 'url': video_url,
+ 'title': title,
+ 'description': clean_html(video.get('body')),
+ 'duration': float_or_none(video.get('duration'), 1000),
+ 'timestamp': int_or_none(video.get('created_at_i')),
+ 'uploader': try_get(
+ video, lambda x: x['user']['username'], compat_str),
+ 'uploader_id': try_get(
+ video, lambda x: compat_str(x['user']['id'])),
+ 'view_count': int_or_none(video.get('total_views_count')),
+ 'like_count': int_or_none(video.get('hearts_count')),
+ 'comment_count': int_or_none(video.get('comments_count')),
+ 'tags': tags,
+ }
'duration': 550,
'age_limit': 18,
'view_count': int,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
'title': 'Cool Jobs - Iditarod Musher',
'description': 'Cold sleds, freezing temps and warm dog breath... an Iditarod musher\'s dream. Kasey-Dee Gardner jumps on a sled to find out what the big deal is.',
'display_id': 'cool-jobs-iditarod-musher',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 161,
},
'skip': 'Video broken',
'title': 'Survival Zone: Food and Water In the Savanna',
'description': 'Learn how to find both food and water while trekking in the African savannah. In this video from the Discovery Channel.',
'display_id': 'survival-zone-food-and-water-in-the-savanna',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'title': 'Sword Swallowing #1 by Dan Meyer',
'description': 'Video footage (1 of 3) used by permission of the owner Dan Meyer through Sword Swallowers Association International <www.swordswallow.org>',
'display_id': 'sword-swallowing-1-by-dan-meyer',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'title': '#新人求关注#',
'description': 're:.*',
'duration': 2424.0,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1475866459,
'upload_date': '20161007',
'uploader': 'Penny_余姿昀',
thumbnails = []
for url in filter(None, data['images'].values()):
- m = re.match('.*-([0-9]+x[0-9]+)\.', url)
+ m = re.match(r'.*-([0-9]+x[0-9]+)\.', url)
if not m:
continue
thumbnails.append({
class ImdbIE(InfoExtractor):
IE_NAME = 'imdb'
IE_DESC = 'Internet Movie Database trailers'
- _VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video/[^/]+/|title/tt\d+.*?#lb-)vi(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video/[^/]+/|title/tt\d+.*?#lb-|videoplayer/)vi(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.imdb.com/video/imdb/vi2524815897',
}, {
'url': 'http://www.imdb.com/title/tt1667889/#lb-vi2524815897',
'only_matching': True,
+ }, {
+ 'url': 'http://www.imdb.com/videoplayer/vi1562949145',
+ 'only_matching': True,
}]
def _real_extract(self, url):
--- /dev/null
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from .kaltura import KalturaIE
+
+
+class IncIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?inc\.com/(?:[^/]+/)+(?P<id>[^.]+)\.html'
+ _TESTS = [{
+ 'url': 'http://www.inc.com/tip-sheet/bill-gates-says-these-5-books-will-make-you-smarter.html',
+ 'md5': '7416739c9c16438c09fa35619d6ba5cb',
+ 'info_dict': {
+ 'id': '1_wqig47aq',
+ 'ext': 'mov',
+ 'title': 'Bill Gates Says These 5 Books Will Make You Smarter',
+ 'description': 'md5:bea7ff6cce100886fc1995acb743237e',
+ 'timestamp': 1474414430,
+ 'upload_date': '20160920',
+ 'uploader_id': 'video@inc.com',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ 'url': 'http://www.inc.com/video/david-whitford/founders-forum-tripadvisor-steve-kaufer-most-enjoyable-moment-for-entrepreneur.html',
+ 'only_matching': True,
+ }]
+
+ def _real_extract(self, url):
+ display_id = self._match_id(url)
+ webpage = self._download_webpage(url, display_id)
+
+ partner_id = self._search_regex(
+ r'var\s+_?bizo_data_partner_id\s*=\s*["\'](\d+)', webpage, 'partner id')
+
+ kaltura_id = self._parse_json(self._search_regex(
+ r'pageInfo\.videos\s*=\s*\[(.+)\];', webpage, 'kaltura id'),
+ display_id)['vid_kaltura_id']
+
+ return self.url_result(
+ 'kaltura:%s:%s' % (partner_id, kaltura_id), KalturaIE.ie_key())
'ext': 'mp4',
'title': 'Cicatánc',
'description': '',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'cukiajanlo',
'uploader_id': '83729',
'timestamp': 1439193826,
'ext': 'mp4',
'title': 'Vicces cica',
'description': 'Játszik a tablettel. :D',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Jet_Pack',
'uploader_id': '491217',
'timestamp': 1390821212,
import base64
-from ..compat import compat_urllib_parse_unquote
+from ..compat import (
+ compat_urllib_parse_unquote,
+ compat_urlparse,
+)
from ..utils import determine_ext
from .bokecc import BokeCCBaseIE
'ext': 'flv',
'description': 'md5:308d981fb28fa42f49f9568322c683ff',
},
+ }, {
+ 'url': 'https://www.infoq.com/presentations/Simple-Made-Easy',
+ 'md5': '0e34642d4d9ef44bf86f66f6399672db',
+ 'info_dict': {
+ 'id': 'Simple-Made-Easy',
+ 'title': 'Simple Made Easy',
+ 'ext': 'mp3',
+ 'description': 'md5:3e0e213a8bbd074796ef89ea35ada25b',
+ },
+ 'params': {
+ 'format': 'bestaudio',
+ },
}]
- def _extract_rtmp_videos(self, webpage):
+ def _extract_rtmp_video(self, webpage):
# The server URL is hardcoded
video_url = 'rtmpe://video.infoq.com/cfx/st/'
playpath = 'mp4:' + real_id
return [{
- 'format_id': 'rtmp',
+ 'format_id': 'rtmp_video',
'url': video_url,
'ext': determine_ext(playpath),
'play_path': playpath,
}]
- def _extract_http_videos(self, webpage):
- http_video_url = self._search_regex(r'P\.s\s*=\s*\'([^\']+)\'', webpage, 'video URL')
-
+ def _extract_cookies(self, webpage):
policy = self._search_regex(r'InfoQConstants.scp\s*=\s*\'([^\']+)\'', webpage, 'policy')
signature = self._search_regex(r'InfoQConstants.scs\s*=\s*\'([^\']+)\'', webpage, 'signature')
key_pair_id = self._search_regex(r'InfoQConstants.sck\s*=\s*\'([^\']+)\'', webpage, 'key-pair-id')
+ return 'CloudFront-Policy=%s; CloudFront-Signature=%s; CloudFront-Key-Pair-Id=%s' % (
+ policy, signature, key_pair_id)
+ def _extract_http_video(self, webpage):
+ http_video_url = self._search_regex(r'P\.s\s*=\s*\'([^\']+)\'', webpage, 'video URL')
return [{
- 'format_id': 'http',
+ 'format_id': 'http_video',
'url': http_video_url,
'http_headers': {
- 'Cookie': 'CloudFront-Policy=%s; CloudFront-Signature=%s; CloudFront-Key-Pair-Id=%s' % (
- policy, signature, key_pair_id),
+ 'Cookie': self._extract_cookies(webpage)
},
}]
+ def _extract_http_audio(self, webpage, video_id):
+ fields = self._hidden_inputs(webpage)
+ http_audio_url = fields.get('filename')
+ if not http_audio_url:
+ return []
+
+ cookies_header = {'Cookie': self._extract_cookies(webpage)}
+
+ # base URL is found in the Location header in the response returned by
+ # GET https://www.infoq.com/mp3download.action?filename=... when logged in.
+ http_audio_url = compat_urlparse.urljoin('http://res.infoq.com/downloads/mp3downloads/', http_audio_url)
+
+ # The audio file is sometimes missing even when a download link is present,
+ # so probe the URL to make sure
+ if not self._is_valid_url(http_audio_url, video_id, headers=cookies_header):
+ return []
+
+ return [{
+ 'format_id': 'http_audio',
+ 'url': http_audio_url,
+ 'vcodec': 'none',
+ 'http_headers': cookies_header,
+ }]
+
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
# for China videos, HTTP video URL exists but always fails with 403
formats = self._extract_bokecc_formats(webpage, video_id)
else:
- formats = self._extract_rtmp_videos(webpage) + self._extract_http_videos(webpage)
+ formats = (
+ self._extract_rtmp_video(webpage) +
+ self._extract_http_video(webpage) +
+ self._extract_http_audio(webpage, video_id))
self._sort_formats(formats)
'ext': 'mp4',
'title': 'Video by naomipq',
'description': 'md5:1f17f0ab29bd6fe2bfad705f58de3cb8',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1371748545,
'upload_date': '20130620',
'uploader_id': 'naomipq',
'id': 'BA-pQFBG8HZ',
'ext': 'mp4',
'title': 'Video by britneyspears',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1453760977,
'upload_date': '20160125',
'uploader_id': 'britneyspears',
'id': '614605558512799803_462752227',
'ext': 'mp4',
'title': '#Porsche Intelligent Performance.',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'uploader': 'Porsche',
'uploader_id': 'porsche',
'timestamp': 1387486713,
options = self._parse_json(
self._search_regex(
- r'(?s)var\s+playerOptions\s*=\s*({.+?});',
+ r'(?s)(?:TDIPlayerOptions|playerOptions)\s*=\s*({.+?});\s*\]\]',
playerpage, 'player options', default='{}'),
video_id, transform_source=js_to_json, fatal=False)
if options:
'id': '95719',
'ext': 'mp4',
'title': 'شایعات نقل و انتقالات مهم فوتبال اروپا 94/02/18',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://www.90tv.ir/video/95719/%D8%B4%D8%A7%DB%8C%D8%B9%D8%A7%D8%AA-%D9%86%D9%82%D9%84-%D9%88-%D8%A7%D9%86%D8%AA%D9%82%D8%A7%D9%84%D8%A7%D8%AA-%D9%85%D9%87%D9%85-%D9%81%D9%88%D8%AA%D8%A8%D8%A7%D9%84-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7-940218',
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import uuid
+import xml.etree.ElementTree as etree
+import json
+
+from .common import InfoExtractor
+from ..compat import (
+ compat_str,
+ compat_etree_register_namespace,
+)
+from ..utils import (
+ extract_attributes,
+ xpath_with_ns,
+ xpath_element,
+ xpath_text,
+ int_or_none,
+ parse_duration,
+ ExtractorError,
+ determine_ext,
+)
+
+
+class ITVIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?itv\.com/hub/[^/]+/(?P<id>[0-9a-zA-Z]+)'
+ _TEST = {
+ 'url': 'http://www.itv.com/hub/mr-bean-animated-series/2a2936a0053',
+ 'info_dict': {
+ 'id': '2a2936a0053',
+ 'ext': 'flv',
+ 'title': 'Home Movie',
+ },
+ 'params': {
+ # rtmp download
+ 'skip_download': True,
+ },
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+ params = extract_attributes(self._search_regex(
+ r'(?s)(<[^>]+id="video"[^>]*>)', webpage, 'params'))
+
+ ns_map = {
+ 'soapenv': 'http://schemas.xmlsoap.org/soap/envelope/',
+ 'tem': 'http://tempuri.org/',
+ 'itv': 'http://schemas.datacontract.org/2004/07/Itv.BB.Mercury.Common.Types',
+ 'com': 'http://schemas.itv.com/2009/05/Common',
+ }
+ for ns, full_ns in ns_map.items():
+ compat_etree_register_namespace(ns, full_ns)
+
+ def _add_ns(name):
+ return xpath_with_ns(name, ns_map)
+
+ def _add_sub_element(element, name):
+ return etree.SubElement(element, _add_ns(name))
+
+ req_env = etree.Element(_add_ns('soapenv:Envelope'))
+ _add_sub_element(req_env, 'soapenv:Header')
+ body = _add_sub_element(req_env, 'soapenv:Body')
+ get_playlist = _add_sub_element(body, 'tem:GetPlaylist')
+ request = _add_sub_element(get_playlist, 'tem:request')
+ _add_sub_element(request, 'itv:ProductionId').text = params['data-video-id']
+ _add_sub_element(request, 'itv:RequestGuid').text = compat_str(uuid.uuid4()).upper()
+ vodcrid = _add_sub_element(request, 'itv:Vodcrid')
+ _add_sub_element(vodcrid, 'com:Id')
+ _add_sub_element(request, 'itv:Partition')
+ user_info = _add_sub_element(get_playlist, 'tem:userInfo')
+ _add_sub_element(user_info, 'itv:Broadcaster').text = 'Itv'
+ _add_sub_element(user_info, 'itv:DM')
+ _add_sub_element(user_info, 'itv:RevenueScienceValue')
+ _add_sub_element(user_info, 'itv:SessionId')
+ _add_sub_element(user_info, 'itv:SsoToken')
+ _add_sub_element(user_info, 'itv:UserToken')
+ site_info = _add_sub_element(get_playlist, 'tem:siteInfo')
+ _add_sub_element(site_info, 'itv:AdvertisingRestriction').text = 'None'
+ _add_sub_element(site_info, 'itv:AdvertisingSite').text = 'ITV'
+ _add_sub_element(site_info, 'itv:AdvertisingType').text = 'Any'
+ _add_sub_element(site_info, 'itv:Area').text = 'ITVPLAYER.VIDEO'
+ _add_sub_element(site_info, 'itv:Category')
+ _add_sub_element(site_info, 'itv:Platform').text = 'DotCom'
+ _add_sub_element(site_info, 'itv:Site').text = 'ItvCom'
+ device_info = _add_sub_element(get_playlist, 'tem:deviceInfo')
+ _add_sub_element(device_info, 'itv:ScreenSize').text = 'Big'
+ player_info = _add_sub_element(get_playlist, 'tem:playerInfo')
+ _add_sub_element(player_info, 'itv:Version').text = '2'
+
+ headers = self.geo_verification_headers()
+ headers.update({
+ 'Content-Type': 'text/xml; charset=utf-8',
+ 'SOAPAction': 'http://tempuri.org/PlaylistService/GetPlaylist',
+ })
+ resp_env = self._download_xml(
+ params['data-playlist-url'], video_id,
+ headers=headers, data=etree.tostring(req_env))
+ playlist = xpath_element(resp_env, './/Playlist')
+ if playlist is None:
+ fault_string = xpath_text(resp_env, './/faultstring')
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, fault_string))
+ title = xpath_text(playlist, 'EpisodeTitle', fatal=True)
+ video_element = xpath_element(playlist, 'VideoEntries/Video', fatal=True)
+ media_files = xpath_element(video_element, 'MediaFiles', fatal=True)
+ rtmp_url = media_files.attrib['base']
+
+ formats = []
+ for media_file in media_files.findall('MediaFile'):
+ play_path = xpath_text(media_file, 'URL')
+ if not play_path:
+ continue
+ tbr = int_or_none(media_file.get('bitrate'), 1000)
+ formats.append({
+ 'format_id': 'rtmp' + ('-%d' % tbr if tbr else ''),
+ 'url': rtmp_url,
+ 'play_path': play_path,
+ 'tbr': tbr,
+ 'ext': 'flv',
+ })
+
+ ios_playlist_url = params.get('data-video-playlist')
+ hmac = params.get('data-video-hmac')
+ if ios_playlist_url and hmac:
+ headers = self.geo_verification_headers()
+ headers.update({
+ 'Accept': 'application/vnd.itv.vod.playlist.v2+json',
+ 'Content-Type': 'application/json',
+ 'hmac': hmac.upper(),
+ })
+ ios_playlist = self._download_json(
+ ios_playlist_url, video_id, data=json.dumps({
+ 'user': {
+ 'itvUserId': '',
+ 'entitlements': [],
+ 'token': ''
+ },
+ 'device': {
+ 'manufacturer': 'Apple',
+ 'model': 'iPad',
+ 'os': {
+ 'name': 'iPhone OS',
+ 'version': '9.3',
+ 'type': 'ios'
+ }
+ },
+ 'client': {
+ 'version': '4.1',
+ 'id': 'browser'
+ },
+ 'variantAvailability': {
+ 'featureset': {
+ 'min': ['hls', 'aes'],
+ 'max': ['hls', 'aes']
+ },
+ 'platformTag': 'mobile'
+ }
+ }).encode(), headers=headers, fatal=False)
+ if ios_playlist:
+ video_data = ios_playlist.get('Playlist', {}).get('Video', {})
+ ios_base_url = video_data.get('Base')
+ for media_file in video_data.get('MediaFiles', []):
+ href = media_file.get('Href')
+ if not href:
+ continue
+ if ios_base_url:
+ href = ios_base_url + href
+ ext = determine_ext(href)
+ if ext == 'm3u8':
+ formats.extend(self._extract_m3u8_formats(href, video_id, 'mp4', m3u8_id='hls', fatal=False))
+ else:
+ formats.append({
+ 'url': href,
+ })
+ self._sort_formats(formats)
+
+ subtitles = {}
+ for caption_url in video_element.findall('ClosedCaptioningURIs/URL'):
+ if not caption_url.text:
+ continue
+ ext = determine_ext(caption_url.text, 'ttml')
+ subtitles.setdefault('en', []).append({
+ 'url': caption_url.text,
+ 'ext': 'ttml' if ext == 'xml' else ext,
+ })
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'formats': formats,
+ 'subtitles': subtitles,
+ 'episode_title': title,
+ 'episode_number': int_or_none(xpath_text(playlist, 'EpisodeNumber')),
+ 'series': xpath_text(playlist, 'ProgrammeTitle'),
+ 'duration': parse_duration(xpath_text(playlist, 'Duration')),
+ }
'title': 'Иван Васильевич меняет профессию',
'description': 'md5:b924063ea1677c8fe343d8a72ac2195f',
'duration': 5498,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'Only works from Russia',
},
'episode': 'Дело Гольдберга (1 часть)',
'episode_number': 1,
'duration': 2655,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'Only works from Russia',
},
'title': 'Кукла',
'description': 'md5:ffca9372399976a2d260a407cc74cce6',
'duration': 5599,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'Only works from Russia',
}
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
-from ..utils import remove_end
+from ..utils import (
+ int_or_none,
+ mimetype2ext,
+ remove_end,
+)
class IwaraIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.|ecchi\.)?iwara\.tv/videos/(?P<id>[a-zA-Z0-9]+)'
_TESTS = [{
'url': 'http://iwara.tv/videos/amVwUl1EHpAD9RD',
- 'md5': '1d53866b2c514b23ed69e4352fdc9839',
+ # md5 is unstable
'info_dict': {
'id': 'amVwUl1EHpAD9RD',
'ext': 'mp4',
'info_dict': {
'id': '0B1LvuHnL-sRFNXB1WHNqbGw4SXc',
'ext': 'mp4',
- 'title': '[3D Hentai] Kyonyu Ã\83\x97 Genkai Ã\83\x97 Emaki Shinobi Girls.mp4',
+ 'title': '[3D Hentai] Kyonyu Ã\97 Genkai Ã\97 Emaki Shinobi Girls.mp4',
'age_limit': 18,
},
'add_ie': ['GoogleDrive'],
}, {
'url': 'http://www.iwara.tv/videos/nawkaumd6ilezzgq',
- 'md5': '1d85f1e5217d2791626cff5ec83bb189',
+ # md5 is unstable
'info_dict': {
'id': '6liAP9s2Ojc',
'ext': 'mp4',
- 'age_limit': 0,
+ 'age_limit': 18,
'title': '[MMD] Do It Again Ver.2 [1080p 60FPS] (Motion,Camera,Wav+DL)',
'description': 'md5:590c12c0df1443d833fbebe05da8c47a',
'upload_date': '20160910',
# ecchi is 'sexy' in Japanese
age_limit = 18 if hostname.split('.')[0] == 'ecchi' else 0
- entries = self._parse_html5_media_entries(url, webpage, video_id)
+ video_data = self._download_json('http://www.iwara.tv/api/video/%s' % video_id, video_id)
- if not entries:
+ if not video_data:
iframe_url = self._html_search_regex(
r'<iframe[^>]+src=([\'"])(?P<url>[^\'"]+)\1',
webpage, 'iframe URL', group='url')
title = remove_end(self._html_search_regex(
r'<title>([^<]+)</title>', webpage, 'title'), ' | Iwara')
- info_dict = entries[0]
- info_dict.update({
+ formats = []
+ for a_format in video_data:
+ format_id = a_format.get('resolution')
+ height = int_or_none(self._search_regex(
+ r'(\d+)p', format_id, 'height', default=None))
+ formats.append({
+ 'url': a_format['uri'],
+ 'format_id': format_id,
+ 'ext': mimetype2ext(a_format.get('mime')) or 'mp4',
+ 'height': height,
+ 'width': int_or_none(height / 9.0 * 16.0 if height else None),
+ 'quality': 1 if format_id == 'Source' else 0,
+ })
+
+ self._sort_formats(formats)
+
+ return {
'id': video_id,
'title': title,
'age_limit': age_limit,
- })
-
- return info_dict
+ 'formats': formats,
+ }
'ext': 'mp4',
'title': 'Sevinçten Çıldırtan Doğum Günü Hediyesi',
'description': 'md5:253753e2655dde93f59f74b572454f6d',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'uploader_id': 'pelikzzle',
'timestamp': int,
'upload_date': '20140702',
'id': '17997',
'ext': 'mp4',
'title': 'Tarkan Dortmund 2006 Konseri',
- 'thumbnail': 're:^https://.*\.jpg',
+ 'thumbnail': r're:^https://.*\.jpg',
'uploader_id': 'parlayankiz',
'timestamp': int,
'upload_date': '20061112',
from ..compat import compat_urlparse
from .common import InfoExtractor
-
-
-class JamendoIE(InfoExtractor):
+from ..utils import parse_duration
+
+
+class JamendoBaseIE(InfoExtractor):
+ def _extract_meta(self, webpage, fatal=True):
+ title = self._og_search_title(
+ webpage, default=None) or self._search_regex(
+ r'<title>([^<]+)', webpage,
+ 'title', default=None)
+ if title:
+ title = self._search_regex(
+ r'(.+?)\s*\|\s*Jamendo Music', title, 'title', default=None)
+ if not title:
+ title = self._html_search_meta(
+ 'name', webpage, 'title', fatal=fatal)
+ mobj = re.search(r'(.+) - (.+)', title or '')
+ artist, second = mobj.groups() if mobj else [None] * 2
+ return title, artist, second
+
+
+class JamendoIE(JamendoBaseIE):
_VALID_URL = r'https?://(?:www\.)?jamendo\.com/track/(?P<id>[0-9]+)/(?P<display_id>[^/?#&]+)'
_TEST = {
'url': 'https://www.jamendo.com/track/196219/stories-from-emona-i',
'id': '196219',
'display_id': 'stories-from-emona-i',
'ext': 'flac',
- 'title': 'Stories from Emona I',
- 'thumbnail': 're:^https?://.*\.jpg'
+ 'title': 'Maya Filipič - Stories from Emona I',
+ 'artist': 'Maya Filipič',
+ 'track': 'Stories from Emona I',
+ 'duration': 210,
+ 'thumbnail': r're:^https?://.*\.jpg'
}
}
webpage = self._download_webpage(url, display_id)
- title = self._html_search_meta('name', webpage, 'title')
+ title, artist, track = self._extract_meta(webpage)
formats = [{
'url': 'https://%s.jamendo.com/?trackid=%s&format=%s&from=app-97dab294'
thumbnail = self._html_search_meta(
'image', webpage, 'thumbnail', fatal=False)
+ duration = parse_duration(self._search_regex(
+ r'<span[^>]+itemprop=["\']duration["\'][^>]+content=["\'](.+?)["\']',
+ webpage, 'duration', fatal=False))
return {
'id': track_id,
'display_id': display_id,
'thumbnail': thumbnail,
'title': title,
+ 'duration': duration,
+ 'artist': artist,
+ 'track': track,
'formats': formats
}
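The shared `_extract_meta` introduced in `JamendoBaseIE` strips the site suffix from the page title and splits the remainder into artist and track/album. A hedged sketch of that parsing step (standalone function, not the extractor method itself):

```python
import re


def extract_meta(page_title):
    # sketch of JamendoBaseIE._extract_meta's parsing: drop the
    # " | Jamendo Music" suffix, then split "Artist - Title"
    m = re.search(r'(.+?)\s*\|\s*Jamendo Music', page_title)
    title = m.group(1) if m else page_title
    m = re.search(r'(.+) - (.+)', title)
    artist, second = m.groups() if m else (None, None)
    return title, artist, second


print(extract_meta('Maya Filipič - Stories from Emona I | Jamendo Music'))
```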
-class JamendoAlbumIE(InfoExtractor):
+class JamendoAlbumIE(JamendoBaseIE):
_VALID_URL = r'https?://(?:www\.)?jamendo\.com/album/(?P<id>[0-9]+)/(?P<display_id>[\w-]+)'
_TEST = {
'url': 'https://www.jamendo.com/album/121486/duck-on-cover',
'info_dict': {
'id': '121486',
- 'title': 'Duck On Cover'
+ 'title': 'Shearer - Duck On Cover'
},
'playlist': [{
'md5': 'e1a2fcb42bda30dfac990212924149a8',
'info_dict': {
'id': '1032333',
'ext': 'flac',
- 'title': 'Warmachine'
+ 'title': 'Shearer - Warmachine',
+ 'artist': 'Shearer',
+ 'track': 'Warmachine',
}
}, {
'md5': '1f358d7b2f98edfe90fd55dac0799d50',
'info_dict': {
'id': '1032330',
'ext': 'flac',
- 'title': 'Without Your Ghost'
+ 'title': 'Shearer - Without Your Ghost',
+ 'artist': 'Shearer',
+ 'track': 'Without Your Ghost',
}
}],
'params': {
webpage = self._download_webpage(url, mobj.group('display_id'))
- title = self._html_search_meta('name', webpage, 'title')
-
- entries = [
- self.url_result(
- compat_urlparse.urljoin(url, m.group('path')),
- ie=JamendoIE.ie_key(),
- video_id=self._search_regex(
- r'/track/(\d+)', m.group('path'),
- 'track id', default=None))
- for m in re.finditer(
- r'<a[^>]+href=(["\'])(?P<path>(?:(?!\1).)+)\1[^>]+class=["\'][^>]*js-trackrow-albumpage-link',
- webpage)
- ]
+ title, artist, album = self._extract_meta(webpage, fatal=False)
+
+ entries = [{
+ '_type': 'url_transparent',
+ 'url': compat_urlparse.urljoin(url, m.group('path')),
+ 'ie_key': JamendoIE.ie_key(),
+ 'id': self._search_regex(
+ r'/track/(\d+)', m.group('path'), 'track id', default=None),
+ 'artist': artist,
+ 'album': album,
+ } for m in re.finditer(
+ r'<a[^>]+href=(["\'])(?P<path>(?:(?!\1).)+)\1[^>]+class=["\'][^>]*js-trackrow-albumpage-link',
+ webpage)]
return self.playlist_result(entries, album_id, title)
'ext': 'mp4',
'title': 'Electrode Positioning and Montage in Transcranial Direct Current Stimulation',
'description': 'md5:015dd4509649c0908bc27f049e0262c6',
- 'thumbnail': 're:^https?://.*\.png$',
+ 'thumbnail': r're:^https?://.*\.png$',
'upload_date': '20110523',
}
},
'ext': 'mp4',
'title': 'Culturing Caenorhabditis elegans in Axenic Liquid Media and Creation of Transgenic Worms by Microparticle Bombardment',
'description': 'md5:35ff029261900583970c4023b70f1dc9',
- 'thumbnail': 're:^https?://.*\.png$',
+ 'thumbnail': r're:^https?://.*\.png$',
'upload_date': '20140802',
}
},
int_or_none,
js_to_json,
mimetype2ext,
+ urljoin,
)
tracks = video_data.get('tracks')
if tracks and isinstance(tracks, list):
for track in tracks:
- if track.get('file') and track.get('kind') == 'captions':
- subtitles.setdefault(track.get('label') or 'en', []).append({
- 'url': self._proto_relative_url(track['file'])
- })
+ if track.get('kind') != 'captions':
+ continue
+ track_url = urljoin(base_url, track.get('file'))
+ if not track_url:
+ continue
+ subtitles.setdefault(track.get('label') or 'en', []).append({
+ 'url': self._proto_relative_url(track_url)
+ })
entries.append({
'id': this_video_id,
'description': video_data.get('description'),
'thumbnail': self._proto_relative_url(video_data.get('image')),
'timestamp': int_or_none(video_data.get('pubdate')),
- 'duration': float_or_none(jwplayer_data.get('duration')),
+ 'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
'subtitles': subtitles,
'formats': formats,
})
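The jwplatform change above switches caption URLs to `urljoin` so that tracks with a missing or unusable `file` are skipped instead of producing broken subtitle entries. A rough sketch of that None-safe join behavior, built on the stdlib (the real `utils.urljoin` handles more cases, e.g. protocol-relative URLs):

```python
from urllib.parse import urljoin as _stdlib_urljoin


def urljoin(base, path):
    # rough sketch: unlike the stdlib helper, return None (rather than
    # raising or producing junk) when either piece is unusable
    if not path or not isinstance(path, str):
        return None
    if path.startswith(('http://', 'https://')):
        return path
    if not base or not base.startswith(('http://', 'https://')):
        return None
    return _stdlib_urljoin(base, path)


# a relative caption path joins onto the base; a missing one yields None
print(urljoin('http://example.com/player/', 'captions/en.vtt'))
print(urljoin('http://example.com/player/', None))  # None
```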
(?P<q1>['\"])wid(?P=q1)\s*:\s*
(?P<q2>['\"])_?(?P<partner_id>(?:(?!(?P=q2)).)+)(?P=q2),.*?
(?P<q3>['\"])entry_?[Ii]d(?P=q3)\s*:\s*
- (?P<q4>['\"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4),
+ (?P<q4>['\"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4)(?:,|\s*\})
""", webpage) or
re.search(
r'''(?xs)
# skip for now.
if f.get('fileExt') == 'chun':
continue
+ if not f.get('fileExt'):
+ # QT indicates QuickTime; some videos have broken fileExt
+ if f.get('containerFormat') == 'qt':
+ f['fileExt'] = 'mov'
+ else:
+ f['fileExt'] = 'mp4'
video_url = sign_url(
'%s/flavorId/%s' % (data_url, f['id']))
# audio-only has no videoCodecId (e.g. kaltura:1926081:0_c03e1b5g
'thumbnail': info.get('thumbnailUrl'),
'duration': info.get('duration'),
'timestamp': info.get('createdAt'),
- 'uploader_id': info.get('userId'),
+ 'uploader_id': info.get('userId') if info.get('userId') != 'None' else None,
'view_count': info.get('plays'),
}
'ext': 'flv',
'title': 'AltenpflegerIn',
'description': 'md5:dbadd1259fde2159a9b28667cb664ae2',
- 'thumbnail': 're:^http://.*\.png',
+ 'thumbnail': r're:^http://.*\.png',
},
'params': {
# rtmp download
'ext': 'flv',
'title': 'Väterkarenz und neue Chancen für Mütter - "Baby - was nun?"',
'description': 'md5:97092c6ad1fd7d38e9d6a5fdeb2bcc33',
- 'thumbnail': 're:^http://.*\.png',
+ 'thumbnail': r're:^http://.*\.png',
},
'params': {
# rtmp download
'display_id': 'petite-asian-lady-mai-playing-in-bathtub',
'ext': 'mp4',
'title': 'Petite Asian Lady Mai Playing In Bathtub',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'view_count': int,
'age_limit': 18,
}
'ext': 'mp4',
'title': 'Gluur mee op de filmset en op Pennenzakkenrock',
'description': 'Gluur mee met Ghost Rockers op de filmset',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'https://www.ketnet.be/kijken/karrewiet/uitzending-8-september-2016',
from __future__ import unicode_literals
from .common import InfoExtractor
+from ..compat import compat_str
from ..utils import (
+ determine_ext,
float_or_none,
int_or_none,
)
class KonserthusetPlayIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?konserthusetplay\.se/\?.*\bm=(?P<id>[^&]+)'
- _TEST = {
+ _VALID_URL = r'https?://(?:www\.)?(?:konserthusetplay|rspoplay)\.se/\?.*\bm=(?P<id>[^&]+)'
+ _TESTS = [{
'url': 'http://www.konserthusetplay.se/?m=CKDDnlCY-dhWAAqiMERd-A',
+ 'md5': 'e3fd47bf44e864bd23c08e487abe1967',
'info_dict': {
'id': 'CKDDnlCY-dhWAAqiMERd-A',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Orkesterns instrument: Valthornen',
'description': 'md5:f10e1f0030202020396a4d712d2fa827',
'thumbnail': 're:^https?://.*$',
- 'duration': 398.8,
+ 'duration': 398.76,
},
- 'params': {
- # rtmp download
- 'skip_download': True,
- },
- }
+ }, {
+ 'url': 'http://rspoplay.se/?m=elWuEH34SMKvaO4wO_cHBw',
+ 'only_matching': True,
+ }]
def _real_extract(self, url):
video_id = self._match_id(url)
player_config = media['playerconfig']
playlist = player_config['playlist']
- source = next(f for f in playlist if f.get('bitrates'))
+ source = next(f for f in playlist if f.get('bitrates') or f.get('provider'))
FORMAT_ID_REGEX = r'_([^_]+)_h264m\.mp4'
formats = []
+ m3u8_url = source.get('url')
+ if m3u8_url and determine_ext(m3u8_url) == 'm3u8':
+ formats.extend(self._extract_m3u8_formats(
+ m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
+ m3u8_id='hls', fatal=False))
+
fallback_url = source.get('fallbackUrl')
fallback_format_id = None
if fallback_url:
thumbnail = media.get('image')
duration = float_or_none(media.get('duration'), 1000)
+ subtitles = {}
+ captions = source.get('captionsAvailableLanguages')
+ if isinstance(captions, dict):
+ for lang, subtitle_url in captions.items():
+ if lang != 'none' and isinstance(subtitle_url, compat_str):
+ subtitles.setdefault(lang, []).append({'url': subtitle_url})
+
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
+ 'subtitles': subtitles,
}
'title': 'Снег, лёд, заносы',
'description': 'Снято в городе Нягань, в Ханты-Мансийском автономном округе.',
'duration': 27,
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
'params': {
'skip_download': 'Not accessible from Travis CI server',
'duration': 223.586,
'upload_date': '20160826',
'timestamp': 1472233118,
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
},
}, {
'url': 'http://kusi.com/video?clipId=12203019',
# coding: utf-8
from __future__ import unicode_literals
-import re
-
from .common import InfoExtractor
-from ..compat import (
- compat_urllib_parse_urlencode,
- compat_urlparse,
-)
from ..utils import (
ExtractorError,
- sanitized_Request,
unified_strdate,
urlencode_postdata,
xpath_element,
xpath_text,
+ urljoin,
+ update_url_query,
)
+class Laola1TvEmbedIE(InfoExtractor):
+ IE_NAME = 'laola1tv:embed'
+ _VALID_URL = r'https?://(?:www\.)?laola1\.tv/titanplayer\.php\?.*?\bvideoid=(?P<id>\d+)'
+ _TEST = {
+ # flashvars.premium = "false";
+ 'url': 'https://www.laola1.tv/titanplayer.php?videoid=708065&type=V&lang=en&portal=int&customer=1024',
+ 'info_dict': {
+ 'id': '708065',
+ 'ext': 'mp4',
+ 'title': 'MA Long CHN - FAN Zhendong CHN',
+ 'uploader': 'ITTF - International Table Tennis Federation',
+ 'upload_date': '20161211',
+ },
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+ flash_vars = self._search_regex(
+ r'(?s)flashvars\s*=\s*({.+?});', webpage, 'flash vars')
+
+ def get_flashvar(x, *args, **kwargs):
+ flash_var = self._search_regex(
+ r'%s\s*:\s*"([^"]+)"' % x,
+ flash_vars, x, default=None)
+ if not flash_var:
+ flash_var = self._search_regex([
+ r'flashvars\.%s\s*=\s*"([^"]+)"' % x,
+ r'%s\s*=\s*"([^"]+)"' % x],
+ webpage, x, *args, **kwargs)
+ return flash_var
+
+ hd_doc = self._download_xml(
+ 'http://www.laola1.tv/server/hd_video.php', video_id, query={
+ 'play': get_flashvar('streamid'),
+ 'partner': get_flashvar('partnerid'),
+ 'portal': get_flashvar('portalid'),
+ 'lang': get_flashvar('sprache'),
+ 'v5ident': '',
+ })
+
+ _v = lambda x, **k: xpath_text(hd_doc, './/video/' + x, **k)
+ title = _v('title', fatal=True)
+
+ token_url = None
+ premium = get_flashvar('premium', default=None)
+ if premium:
+ token_url = update_url_query(
+ _v('url', fatal=True), {
+ 'timestamp': get_flashvar('timestamp'),
+ 'auth': get_flashvar('auth'),
+ })
+ else:
+ data_abo = urlencode_postdata(
+ dict((i, v) for i, v in enumerate(_v('req_liga_abos').split(','))))
+ token_url = self._download_json(
+ 'https://club.laola1.tv/sp/laola1/api/v3/user/session/premium/player/stream-access',
+ video_id, query={
+ 'videoId': _v('id'),
+ 'target': self._search_regex(r'vs_target = (\d+);', webpage, 'vs target'),
+ 'label': _v('label'),
+ 'area': _v('area'),
+ }, data=data_abo)['data']['stream-access'][0]
+
+ token_doc = self._download_xml(
+ token_url, video_id, 'Downloading token',
+ headers=self.geo_verification_headers())
+
+ token_attrib = xpath_element(token_doc, './/token').attrib
+
+ if token_attrib['status'] != '0':
+ raise ExtractorError(
+ 'Token error: %s' % token_attrib['comment'], expected=True)
+
+ formats = self._extract_akamai_formats(
+ '%s?hdnea=%s' % (token_attrib['url'], token_attrib['auth']),
+ video_id)
+ self._sort_formats(formats)
+
+ categories_str = _v('meta_sports')
+ categories = categories_str.split(',') if categories_str else []
+ is_live = _v('islive') == 'true'
+
+ return {
+ 'id': video_id,
+ 'title': self._live_title(title) if is_live else title,
+ 'upload_date': unified_strdate(_v('time_date')),
+ 'uploader': _v('meta_organisation'),
+ 'categories': categories,
+ 'is_live': is_live,
+ 'formats': formats,
+ }
+
+
class Laola1TvIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?laola1\.tv/(?P<lang>[a-z]+)-(?P<portal>[a-z]+)/(?P<kind>[^/]+)/(?P<slug>[^/?#&]+)'
+ IE_NAME = 'laola1tv'
+ _VALID_URL = r'https?://(?:www\.)?laola1\.tv/[a-z]+-[a-z]+/[^/]+/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.laola1.tv/de-de/video/straubing-tigers-koelner-haie/227883.html',
'info_dict': {
}]
def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- display_id = mobj.group('slug')
- kind = mobj.group('kind')
- lang = mobj.group('lang')
- portal = mobj.group('portal')
+ display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
if 'Dieser Livestream ist bereits beendet.' in webpage:
raise ExtractorError('This live stream has already finished.', expected=True)
- iframe_url = self._search_regex(
+ iframe_url = urljoin(url, self._search_regex(
r'<iframe[^>]*?id="videoplayer"[^>]*?src="([^"]+)"',
- webpage, 'iframe url')
-
- video_id = self._search_regex(
- r'videoid=(\d+)', iframe_url, 'video id')
-
- iframe = self._download_webpage(compat_urlparse.urljoin(
- url, iframe_url), display_id, 'Downloading iframe')
-
- partner_id = self._search_regex(
- r'partnerid\s*:\s*(["\'])(?P<partner_id>.+?)\1',
- iframe, 'partner id', group='partner_id')
-
- hd_doc = self._download_xml(
- 'http://www.laola1.tv/server/hd_video.php?%s'
- % compat_urllib_parse_urlencode({
- 'play': video_id,
- 'partner': partner_id,
- 'portal': portal,
- 'lang': lang,
- 'v5ident': '',
- }), display_id)
-
- _v = lambda x, **k: xpath_text(hd_doc, './/video/' + x, **k)
- title = _v('title', fatal=True)
-
- VS_TARGETS = {
- 'video': '2',
- 'livestream': '17',
- }
-
- req = sanitized_Request(
- 'https://club.laola1.tv/sp/laola1/api/v3/user/session/premium/player/stream-access?%s' %
- compat_urllib_parse_urlencode({
- 'videoId': video_id,
- 'target': VS_TARGETS.get(kind, '2'),
- 'label': _v('label'),
- 'area': _v('area'),
- }),
- urlencode_postdata(
- dict((i, v) for i, v in enumerate(_v('req_liga_abos').split(',')))))
-
- token_url = self._download_json(req, display_id)['data']['stream-access'][0]
- token_doc = self._download_xml(token_url, display_id, 'Downloading token')
-
- token_attrib = xpath_element(token_doc, './/token').attrib
- token_auth = token_attrib['auth']
-
- if token_auth in ('blocked', 'restricted', 'error'):
- raise ExtractorError(
- 'Token error: %s' % token_attrib['comment'], expected=True)
-
- formats = self._extract_f4m_formats(
- '%s?hdnea=%s&hdcore=3.2.0' % (token_attrib['url'], token_auth),
- video_id, f4m_id='hds')
- self._sort_formats(formats)
-
- categories_str = _v('meta_sports')
- categories = categories_str.split(',') if categories_str else []
+ webpage, 'iframe url'))
return {
- 'id': video_id,
+ '_type': 'url',
'display_id': display_id,
- 'title': title,
- 'upload_date': unified_strdate(_v('time_date')),
- 'uploader': _v('meta_organisation'),
- 'categories': categories,
- 'is_live': _v('islive') == 'true',
- 'formats': formats,
+ 'url': iframe_url,
+ 'ie_key': 'Laola1TvEmbed',
}
return formats
def _real_extract(self, url):
- uu_mobj = re.search('uu=([\w]+)', url)
- vu_mobj = re.search('vu=([\w]+)', url)
+ uu_mobj = re.search(r'uu=([\w]+)', url)
+ vu_mobj = re.search(r'vu=([\w]+)', url)
if not uu_mobj or not vu_mobj:
raise ExtractorError('Invalid URL: %s' % url, expected=True)
'id': 'lqm3kl',
'ext': 'mp4',
'title': "Comprendre l'affaire Bygmalion en 5 minutes",
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 320,
'upload_date': '20160119',
'timestamp': 1453194778,
'id': '90716351',
'ext': 'mp4',
'title': "Pa's trip to Mars",
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 0,
'view_count': int,
},
formats = [{
'url': media_url,
- } for media_url in set(re.findall('var\s+mediaURL(?:Libsyn)?\s*=\s*"([^"]+)"', webpage))]
+ } for media_url in set(re.findall(r'var\s+mediaURL(?:Libsyn)?\s*=\s*"([^"]+)"', webpage))]
podcast_title = self._search_regex(
r'<h2>([^<]+)</h2>', webpage, 'podcast title', default=None)
'id': 'e50c2dec2867350528e2574c899b8291',
'ext': 'mp4',
'title': 'e50c2dec2867350528e2574c899b8291',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
}
}, {
# with 1080p
format_id = 'rtmp'
if stream.get('videoBitRate'):
format_id += '-%d' % int_or_none(stream['videoBitRate'])
- http_url = 'http://cpl.delvenetworks.com/' + rtmp.group('playpath')[4:]
- urls.append(http_url)
- http_fmt = fmt.copy()
- http_fmt.update({
- 'url': http_url,
- 'format_id': format_id.replace('rtmp', 'http'),
- })
- formats.append(http_fmt)
+ http_format_id = format_id.replace('rtmp', 'http')
+
+ CDN_HOSTS = (
+ ('delvenetworks.com', 'cpl.delvenetworks.com'),
+ ('video.llnw.net', 's2.content.video.llnw.net'),
+ )
+ for cdn_host, http_host in CDN_HOSTS:
+ if cdn_host not in rtmp.group('host').lower():
+ continue
+ http_url = 'http://%s/%s' % (http_host, rtmp.group('playpath')[4:])
+ urls.append(http_url)
+ if self._is_valid_url(http_url, video_id, http_format_id):
+ http_fmt = fmt.copy()
+ http_fmt.update({
+ 'url': http_url,
+ 'format_id': http_format_id,
+ })
+ formats.append(http_fmt)
+ break
+
fmt.update({
'url': rtmp.group('url'),
'play_path': rtmp.group('playpath'),
'ext': 'mp4',
'title': 'HaP and the HB Prince Trailer',
'description': 'md5:8005b944181778e313d95c1237ddb640',
- 'thumbnail': 're:^https?://.*\.jpeg$',
+ 'thumbnail': r're:^https?://.*\.jpeg$',
'duration': 144.23,
'timestamp': 1244136834,
'upload_date': '20090604',
'id': 'a3e00274d4564ec4a9b29b9466432335',
'ext': 'mp4',
'title': '3Play Media Overview Video',
- 'thumbnail': 're:^https?://.*\.jpeg$',
+ 'thumbnail': r're:^https?://.*\.jpeg$',
'duration': 78.101,
'timestamp': 1338929955,
'upload_date': '20120605',
'id': 'VOD00041610',
'ext': 'mp4',
'title': '花千骨第1集',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'description': 'md5:c7017aa144c87467c4fb2909c4b05d6f',
'episode_number': 1,
},
webpage = self._download_webpage(url, video_id)
program_info = self._parse_json(self._search_regex(
- 'var\s+programInfo\s*=\s*([^;]+)', webpage, 'VOD data', default='{}'),
+ r'var\s+programInfo\s*=\s*([^;]+)', webpage, 'VOD data', default='{}'),
video_id)
season_list = list(program_info.get('seasonList', {}).values())
'description': 'extremely bad day for this guy..!',
'uploader': 'ljfriel2',
'title': 'Most unlucky car accident',
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
}
}, {
'url': 'http://www.liveleak.com/view?i=f93_1390833151',
'description': 'German Television Channel NDR does an exclusive interview with Edward Snowden.\r\nUploaded on LiveLeak cause German Television thinks the rest of the world isn\'t intereseted in Edward Snowden.',
'uploader': 'ARD_Stinkt',
'title': 'German Television does first Edward Snowden Interview (ENGLISH)',
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
}
}, {
'url': 'http://www.liveleak.com/view?i=4f7_1392687779',
'description': 'Happened on 27.7.2014. \r\nAt 0:53 you can see people still swimming at near beach.',
'uploader': 'bony333',
'title': 'Crazy Hungarian tourist films close call waterspout in Croatia',
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
}
}, {
# Covers https://github.com/rg3/youtube-dl/pull/10664#issuecomment-247439521
'duration': 5968.0,
'like_count': int,
'view_count': int,
- 'thumbnail': 're:^http://.*\.jpg$'
+ 'thumbnail': r're:^http://.*\.jpg$'
}
}, {
'url': 'http://new.livestream.com/tedx/cityenglish',
'description': 'md5:d82a5e36b775b7048617f263a0e3475e',
'age_limit': 7,
'duration': 3019,
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
},
'params': {
'skip_download': True, # HLS download
'description': 'md5:7352d113a242a808676ff17e69db6a69',
'age_limit': 18,
'duration': 346,
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
},
'params': {
'skip_download': True, # HLS download
# Already logged in
if any(re.search(p, signin_page) for p in (
- 'isLoggedIn\s*:\s*true', r'logout\.aspx', r'>Log out<')):
+ r'isLoggedIn\s*:\s*true', r'logout\.aspx', r'>Log out<')):
return
# Step 2: submit email
'info_dict': {
'id': 'matchtv-live',
'ext': 'flv',
- 'title': 're:^Матч ТВ - Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+ 'title': r're:^Матч ТВ - Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
},
'params': {
data_url = self._search_regex(
r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1',
- webpage, 'data url', group='url').replace('\/', '/')
+ webpage, 'data url', group='url').replace(r'\/', '/')
doc = self._download_xml(
compat_urlparse.urljoin(url, data_url), video_id)
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ parse_duration,
+ unified_timestamp,
+)
+
+
+class MeipaiIE(InfoExtractor):
+ IE_DESC = '美拍'
+ _VALID_URL = r'https?://(?:www\.)?meipai\.com/media/(?P<id>[0-9]+)'
+ _TESTS = [{
+ # regular uploaded video
+ 'url': 'http://www.meipai.com/media/531697625',
+ 'md5': 'e3e9600f9e55a302daecc90825854b4f',
+ 'info_dict': {
+ 'id': '531697625',
+ 'ext': 'mp4',
+ 'title': '#葉子##阿桑##余姿昀##超級女聲#',
+ 'description': '#葉子##阿桑##余姿昀##超級女聲#',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'duration': 152,
+ 'timestamp': 1465492420,
+ 'upload_date': '20160609',
+ 'view_count': 35511,
+ 'creator': '她她-TATA',
+ 'tags': ['葉子', '阿桑', '余姿昀', '超級女聲'],
+ }
+ }, {
+ # record of live streaming
+ 'url': 'http://www.meipai.com/media/585526361',
+ 'md5': 'ff7d6afdbc6143342408223d4f5fb99a',
+ 'info_dict': {
+ 'id': '585526361',
+ 'ext': 'mp4',
+ 'title': '姿昀和善願 練歌練琴啦😁😁😁',
+ 'description': '姿昀和善願 練歌練琴啦😁😁😁',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'duration': 5975,
+ 'timestamp': 1474311799,
+ 'upload_date': '20160919',
+ 'view_count': 1215,
+ 'creator': '她她-TATA',
+ }
+ }]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+
+ title = self._og_search_title(
+ webpage, default=None) or self._html_search_regex(
+ r'<title[^>]*>([^<]+)</title>', webpage, 'title')
+
+ formats = []
+
+ # recorded playback of live streaming
+ m3u8_url = self._html_search_regex(
+ r'file:\s*encodeURIComponent\((["\'])(?P<url>(?:(?!\1).)+)\1\)',
+ webpage, 'm3u8 url', group='url', default=None)
+ if m3u8_url:
+ formats.extend(self._extract_m3u8_formats(
+ m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
+ m3u8_id='hls', fatal=False))
+
+ if not formats:
+ # regular uploaded video
+ video_url = self._search_regex(
+ r'data-video=(["\'])(?P<url>(?:(?!\1).)+)\1', webpage, 'video url',
+ group='url', default=None)
+ if video_url:
+ formats.append({
+ 'url': video_url,
+ 'format_id': 'http',
+ })
+
+ timestamp = unified_timestamp(self._og_search_property(
+ 'video:release_date', webpage, 'release date', fatal=False))
+
+ tags = self._og_search_property(
+ 'video:tag', webpage, 'tags', default='').split(',')
+
+ view_count = int_or_none(self._html_search_meta(
+ 'interactionCount', webpage, 'view count'))
+ duration = parse_duration(self._html_search_meta(
+ 'duration', webpage, 'duration'))
+ creator = self._og_search_property(
+ 'video:director', webpage, 'creator', fatal=False)
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': self._og_search_description(webpage),
+ 'thumbnail': self._og_search_thumbnail(webpage),
+ 'duration': duration,
+ 'timestamp': timestamp,
+ 'view_count': view_count,
+ 'creator': creator,
+ 'tags': tags,
+ 'formats': formats,
+ }
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ urljoin,
+)
+
+
+class MelonVODIE(InfoExtractor):
+ _VALID_URL = r'https?://vod\.melon\.com/video/detail2\.html?\?.*?mvId=(?P<id>[0-9]+)'
+ _TEST = {
+ 'url': 'http://vod.melon.com/video/detail2.htm?mvId=50158734',
+ 'info_dict': {
+ 'id': '50158734',
+ 'ext': 'mp4',
+ 'title': "Jessica 'Wonderland' MV Making Film",
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'artist': 'Jessica (제시카)',
+ 'upload_date': '20161212',
+ 'duration': 203,
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download',
+ }
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ play_info = self._download_json(
+ 'http://vod.melon.com/video/playerInfo.json', video_id,
+ note='Downloading player info JSON', query={'mvId': video_id})
+
+ title = play_info['mvInfo']['MVTITLE']
+
+ info = self._download_json(
+ 'http://vod.melon.com/delivery/streamingInfo.json', video_id,
+ note='Downloading streaming info JSON',
+ query={
+ 'contsId': video_id,
+ 'contsType': 'VIDEO',
+ })
+
+ stream_info = info['streamingInfo']
+
+ formats = self._extract_m3u8_formats(
+ stream_info['encUrl'], video_id, 'mp4', m3u8_id='hls')
+ self._sort_formats(formats)
+
+ artist_list = play_info.get('artistList')
+ artist = None
+ if isinstance(artist_list, list):
+ artist = ', '.join(
+ [a['ARTISTNAMEWEBLIST']
+ for a in artist_list if a.get('ARTISTNAMEWEBLIST')])
+
+ thumbnail = urljoin(info.get('staticDomain'), stream_info.get('imgPath'))
+
+ duration = int_or_none(stream_info.get('playTime'))
+ upload_date = stream_info.get('mvSvcOpenDt', '')[:8] or None
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'artist': artist,
+ 'thumbnail': thumbnail,
+ 'upload_date': upload_date,
+ 'duration': duration,
+ 'formats': formats
+ }
video_id, display_id = re.match(self._VALID_URL, url).groups()
# the video may come from an external site
- m_external = re.match('^(\w{2})-(.*)$', video_id)
+ m_external = re.match(r'^(\w{2})-(.*)$', video_id)
if m_external is not None:
prefix, ext_id = m_external.groups()
# Check if video comes from YouTube
'upload_date': '20131220',
'ext': 'mp4',
'title': 'md5:543aa4c27a4931d371c3f433e8cebebc',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
},
{
'title': '我是歌手第四季双年巅峰会:韩红李玟“双王”领军对抗',
'description': '我是歌手第四季双年巅峰会',
'duration': 7461,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
# no tbr extracted from stream_url
'id': '125848331',
'ext': 'mp4',
'title': 'youtube-dl test video',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'filesize_approx': 1530000,
'duration': 9,
'view_count': int,
'id': '3453494717001',
'ext': 'mp4',
'title': 'The Gospel by Numbers',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'upload_date': '20140410',
'description': 'Coming soon from T4G 2014!',
'uploader_id': '2034960640001',
'season_id': 'diario_de_t14_11981',
'episode': 'Programa 144',
'episode_number': 3,
- 'thumbnail': 're:(?i)^https?://.*\.jpg$',
+ 'thumbnail': r're:(?i)^https?://.*\.jpg$',
'duration': 2913,
},
'add_ie': ['Ooyala'],
'season_id': 'cuarto_milenio_t06_12715',
'episode': 'Programa 226',
'episode_number': 24,
- 'thumbnail': 're:(?i)^https?://.*\.jpg$',
+ 'thumbnail': r're:(?i)^https?://.*\.jpg$',
'duration': 7313,
},
'params': {
return {
'_type': 'url_transparent',
# for some reason only HLS is supported
- 'url': smuggle_url('ooyala:' + embedCode, {'supportedformats': 'm3u8'}),
+ 'url': smuggle_url('ooyala:' + embedCode, {'supportedformats': 'm3u8,dash'}),
'id': video_id,
'title': title,
'description': description,
clean_html,
ExtractorError,
OnDemandPagedList,
- parse_count,
str_to_int,
)
class MixcloudIE(InfoExtractor):
- _VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([^/]+)/(?!stream|uploads|favorites|listens|playlists)([^/]+)'
+ _VALID_URL = r'https?://(?:(?:www|beta|m)\.)?mixcloud\.com/([^/]+)/(?!stream|uploads|favorites|listens|playlists)([^/]+)'
IE_NAME = 'mixcloud'
_TESTS = [{
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
'uploader': 'Daniel Holbach',
'uploader_id': 'dholbach',
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
- 'like_count': int,
},
}, {
'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
'uploader_id': 'gillespeterson',
'thumbnail': 're:https?://.*',
'view_count': int,
- 'like_count': int,
},
+ }, {
+ 'url': 'https://beta.mixcloud.com/RedLightRadio/nosedrip-15-red-light-radio-01-18-2016/',
+ 'only_matching': True,
}]
# See https://www.mixcloud.com/media/js2/www_js_2.9e23256562c080482435196ca3975ab5.js
song_url = play_info['stream_url']
- PREFIX = (
- r'm-play-on-spacebar[^>]+'
- r'(?:\s+[a-zA-Z0-9-]+(?:="[^"]+")?)*?\s+')
- title = self._html_search_regex(
- PREFIX + r'm-title="([^"]+)"', webpage, 'title')
+ title = self._html_search_regex(r'm-title="([^"]+)"', webpage, 'title')
thumbnail = self._proto_relative_url(self._html_search_regex(
- PREFIX + r'm-thumbnail-url="([^"]+)"', webpage, 'thumbnail',
- fatal=False))
+ r'm-thumbnail-url="([^"]+)"', webpage, 'thumbnail', fatal=False))
uploader = self._html_search_regex(
- PREFIX + r'm-owner-name="([^"]+)"',
- webpage, 'uploader', fatal=False)
+ r'm-owner-name="([^"]+)"', webpage, 'uploader', fatal=False)
uploader_id = self._search_regex(
r'\s+"profile": "([^"]+)",', webpage, 'uploader id', fatal=False)
description = self._og_search_description(webpage)
- like_count = parse_count(self._search_regex(
- r'\bbutton-favorite[^>]+>.*?<span[^>]+class=["\']toggle-number[^>]+>\s*([^<]+)',
- webpage, 'like count', default=None))
view_count = str_to_int(self._search_regex(
[r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
- r'/listeners/?">([0-9,.]+)</a>'],
+ r'/listeners/?">([0-9,.]+)</a>',
+ r'm-tooltip=["\']([\d,.]+) plays'],
webpage, 'play count', default=None))
return {
'uploader': uploader,
'uploader_id': uploader_id,
'view_count': view_count,
- 'like_count': like_count,
}
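The Mixcloud hunk runs the scraped play count through `str_to_int`, which tolerates the thousands separators sites render into counters like "35,511 plays". A sketch of that helper:

```python
import re


def str_to_int(int_str):
    # sketch of utils.str_to_int: strip thousands separators
    # (commas, dots, plus signs) before converting
    if int_str is None:
        return None
    return int(re.sub(r'[,\.\+]', '', int_str))


print(str_to_int('1,234,567'))  # 1234567
```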
'duration': 66,
'timestamp': 1405980600,
'upload_date': '20140721',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'duration': 46,
'timestamp': 1405105800,
'upload_date': '20140711',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'duration': 488,
'timestamp': 1405399936,
'upload_date': '20140715',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'duration': 52,
'timestamp': 1405390722,
'upload_date': '20140715',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'timestamp': 1451564040,
'age_limit': 0,
'thumbnails': 'mincount:5',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'ext': 'flv',
},
'params': {
'ext': 'flv',
'title': 'Sink cut out machine',
'description': 'md5:f29ff97b663aefa760bf7ca63c8ca8a8',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'width': 540,
'height': 360,
'duration': 179,
'ext': 'flv',
'title': 'Operacion Condor.',
'description': 'md5:7e68cb2fcda66833d5081c542491a9a3',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'width': 480,
'height': 296,
'duration': 6027,
'display_id': 'amateur-teen-playing-and-masturbating-318131',
'ext': 'mp4',
'title': 'amateur teen playing and masturbating',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20121114',
'view_count': int,
'like_count': int,
'display_id': 'v-avtu-pred-mano-rdecelaska-alfi-nipic',
'ext': 'mp4',
'title': 'V avtu pred mano rdečelaska - Alfi Nipič',
- 'thumbnail': 're:^http://.*\.jpg$',
+ 'thumbnail': r're:^http://.*\.jpg$',
'duration': 242,
}
}
'categories': ['Gaming', 'anal', 'reluctant', 'rough', 'Wife'],
'upload_date': '20100913',
'uploader_id': 'famouslyfuckedup',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'age_limit': 18,
}
}, {
'game', 'hairy'],
'upload_date': '20140622',
'uploader_id': 'Sulivana7x',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'age_limit': 18,
},
'skip': '404',
'categories': ['superheroine heroine superher'],
'upload_date': '20140827',
'uploader_id': 'shade0230',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'age_limit': 18,
}
}, {
'ext': 'mp4',
'title': 'Warcraft Trailer 1',
'description': 'Watch Trailer 1 from Warcraft (2016). Legendary’s WARCRAFT is a 3D epic adventure of world-colliding conflict based.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1446843055,
'upload_date': '20151106',
'uploader': 'Movieclips',
'ext': 'mp4',
'title': 'Oculus - Trailer 1',
'description': 'md5:40cc6790fc81d931850ca9249b40e8a4',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}
'title': 'SHETLAND WOOL',
'description': 'md5:c5afca6871ad59b4271e7704fe50ab04',
'duration': 900,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}
m3u8_formats = self._extract_m3u8_formats(
format_url, display_id, 'mp4',
m3u8_id='hls', fatal=False)
- # Despite metadata in m3u8 all video+audio formats are
- # actually video-only (no audio)
- for f in m3u8_formats:
- if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
- f['acodec'] = 'none'
formats.extend(m3u8_formats)
else:
formats.append({
fix_xml_ampersands,
float_or_none,
HEADRequest,
- NO_DEFAULT,
RegexNotFoundError,
sanitized_Request,
strip_or_none,
timeconvert,
+ try_get,
unescapeHTML,
update_url_query,
url_basename,
# Remove the templates, like &device={device}
return re.sub(r'&[^=]*?={.*?}(?=(&|$))', '', url)
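The template-stripping regex above is compact; a standalone sketch of the same substitution (the example URL is made up) shows what it keeps and drops:

```python
import re

def remove_template_parameter(url):
    # Strip unexpanded template parameters such as &device={device}
    # while leaving ordinary query parameters intact.
    return re.sub(r'&[^=]*?={.*?}(?=(&|$))', '', url)

cleaned = remove_template_parameter(
    'http://example.com/mediagen?uri=mgid:x&device={device}&format=json')
```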
- # This was originally implemented for ComedyCentral, but it also works here
- @classmethod
- def _transform_rtmp_url(cls, rtmp_video_url):
- m = re.match(r'^rtmpe?://.*?/(?P<finalid>gsp\..+?/.*)$', rtmp_video_url)
- if not m:
- return {'rtmp': rtmp_video_url}
- base = 'http://viacommtvstrmfs.fplive.net/'
- return {'http': base + m.group('finalid')}
-
def _get_feed_url(self, uri):
return self._FEED_URL
url = re.sub(r'.+pxE=mp4', 'http://mtvnmobile.vo.llnwd.net/kip0/_pxn=0+_pxK=18639+_pxE=mp4', url, 1)
return [{'url': url, 'ext': 'mp4'}]
- def _extract_video_formats(self, mdoc, mtvn_id):
+ def _extract_video_formats(self, mdoc, mtvn_id, video_id):
if re.match(r'.*/(error_country_block\.swf|geoblock\.mp4|copyright_error\.flv(?:\?geo\b.+?)?)$', mdoc.find('.//src').text) is not None:
if mtvn_id is not None and self._MOBILE_TEMPLATE is not None:
self.to_screen('The normal version is not available from your '
formats = []
for rendition in mdoc.findall('.//rendition'):
- try:
- _, _, ext = rendition.attrib['type'].partition('/')
- rtmp_video_url = rendition.find('./src').text
- if rtmp_video_url.endswith('siteunavail.png'):
- continue
- new_urls = self._transform_rtmp_url(rtmp_video_url)
- formats.extend([{
- 'ext': 'flv' if new_url.startswith('rtmp') else ext,
- 'url': new_url,
- 'format_id': '-'.join(filter(None, [kind, rendition.get('bitrate')])),
- 'width': int(rendition.get('width')),
- 'height': int(rendition.get('height')),
- } for kind, new_url in new_urls.items()])
- except (KeyError, TypeError):
- raise ExtractorError('Invalid rendition field.')
+ if rendition.get('method') == 'hls':
+ hls_url = rendition.find('./src').text
+ formats.extend(self._extract_m3u8_formats(
+ hls_url, video_id, ext='mp4', entry_protocol='m3u8_native',
+ m3u8_id='hls'))
+ else:
+                # fms (non-HLS renditions, typically RTMP)
+ try:
+ _, _, ext = rendition.attrib['type'].partition('/')
+ rtmp_video_url = rendition.find('./src').text
+ if 'error_not_available.swf' in rtmp_video_url:
+ raise ExtractorError(
+ '%s said: video is not available' % self.IE_NAME,
+ expected=True)
+ if rtmp_video_url.endswith('siteunavail.png'):
+ continue
+ formats.extend([{
+ 'ext': 'flv' if rtmp_video_url.startswith('rtmp') else ext,
+ 'url': rtmp_video_url,
+ 'format_id': '-'.join(filter(None, [
+ 'rtmp' if rtmp_video_url.startswith('rtmp') else None,
+ rendition.get('bitrate')])),
+ 'width': int(rendition.get('width')),
+ 'height': int(rendition.get('height')),
+ }])
+ except (KeyError, TypeError):
+ raise ExtractorError('Invalid rendition field.')
self._sort_formats(formats)
return formats
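The rewritten loop branches on the rendition's `method` attribute, sending HLS renditions to the m3u8 extractor and everything else down the RTMP/flv path. A self-contained sketch of that split over a hypothetical mediagen document (element names follow the code above; the URLs are fabricated):

```python
import xml.etree.ElementTree as ET

# Hypothetical mediagen document: one HLS rendition, one RTMP rendition.
MEDIAGEN = '''
<package>
  <rendition method="hls" type="application/x-mpegURL">
    <src>http://example.com/master.m3u8</src>
  </rendition>
  <rendition bitrate="1200" width="1280" height="720" type="video/mp4">
    <src>rtmp://example.com/video_1200.mp4</src>
  </rendition>
</package>
'''

def split_renditions(mdoc):
    # Partition rendition src URLs the way _extract_video_formats does:
    # method="hls" goes to the m3u8 path, the rest to the fms path.
    hls, fms = [], []
    for rendition in mdoc.findall('.//rendition'):
        target = hls if rendition.get('method') == 'hls' else fms
        target.append(rendition.find('./src').text)
    return hls, fms

hls_urls, fms_urls = split_renditions(ET.fromstring(MEDIAGEN))
```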
} for typographic in transcript.findall('./typographic')]
return subtitles
- def _get_video_info(self, itemdoc):
+ def _get_video_info(self, itemdoc, use_hls=True):
uri = itemdoc.find('guid').text
video_id = self._id_from_uri(uri)
self.report_extraction(video_id)
content_el = itemdoc.find('%s/%s' % (_media_xml_tag('group'), _media_xml_tag('content')))
mediagen_url = self._remove_template_parameter(content_el.attrib['url'])
+ mediagen_url = mediagen_url.replace('device={device}', '')
if 'acceptMethods' not in mediagen_url:
mediagen_url += '&' if '?' in mediagen_url else '?'
- mediagen_url += 'acceptMethods=fms'
+ mediagen_url += 'acceptMethods='
+ mediagen_url += 'hls' if use_hls else 'fms'
mediagen_doc = self._download_xml(mediagen_url, video_id,
'Downloading video urls')
if mtvn_id_node is not None:
mtvn_id = mtvn_id_node.text
+ formats = self._extract_video_formats(mediagen_doc, mtvn_id, video_id)
+
return {
'title': title,
- 'formats': self._extract_video_formats(mediagen_doc, mtvn_id),
+ 'formats': formats,
'subtitles': self._extract_subtitles(mediagen_doc, mtvn_id),
'id': video_id,
'thumbnail': self._get_thumbnail_url(uri, itemdoc),
data['lang'] = self._LANG
return data
- def _get_videos_info(self, uri):
+ def _get_videos_info(self, uri, use_hls=True):
video_id = self._id_from_uri(uri)
feed_url = self._get_feed_url(uri)
info_url = update_url_query(feed_url, self._get_feed_query(uri))
- return self._get_videos_info_from_url(info_url, video_id)
+ return self._get_videos_info_from_url(info_url, video_id, use_hls)
- def _get_videos_info_from_url(self, url, video_id):
+ def _get_videos_info_from_url(self, url, video_id, use_hls=True):
idoc = self._download_xml(
url, video_id,
'Downloading info', transform_source=fix_xml_ampersands)
description = xpath_text(idoc, './channel/description')
return self.playlist_result(
- [self._get_video_info(item) for item in idoc.findall('.//item')],
+ [self._get_video_info(item, use_hls) for item in idoc.findall('.//item')],
playlist_title=title, playlist_description=description)
- def _extract_mgid(self, webpage, default=NO_DEFAULT):
+ def _extract_triforce_mgid(self, webpage, data_zone=None, video_id=None):
+ triforce_feed = self._parse_json(self._search_regex(
+ r'triforceManifestFeed\s*=\s*({.+?})\s*;\s*\n', webpage,
+ 'triforce feed', default='{}'), video_id, fatal=False)
+
+ data_zone = self._search_regex(
+ r'data-zone=(["\'])(?P<zone>.+?_lc_promo.*?)\1', webpage,
+ 'data zone', default=data_zone, group='zone')
+
+ feed_url = try_get(
+ triforce_feed, lambda x: x['manifest']['zones'][data_zone]['feed'],
+ compat_str)
+ if not feed_url:
+ return
+
+ feed = self._download_json(feed_url, video_id, fatal=False)
+ if not feed:
+ return
+
+ return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
+
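`_extract_triforce_mgid` leans on the newly imported `try_get` to walk a nested JSON structure without raising on missing keys. A simplified stand-in for that utility, exercised on a made-up feed shape mirroring the lookup above:

```python
def try_get(src, getter, expected_type=None):
    # Simplified stand-in for youtube-dl's utils.try_get: apply the getter,
    # swallow lookup errors, and optionally enforce the result type.
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(v, expected_type):
        return v
    return None

# Shape mirrors the triforce manifest walked above (values are made up).
feed = {'manifest': {'zones': {'promo_lc_promo': {'feed': 'http://example.com/feed'}}}}
feed_url = try_get(
    feed, lambda x: x['manifest']['zones']['promo_lc_promo']['feed'], str)
```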
+ def _extract_mgid(self, webpage):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
sm4_embed = self._html_search_meta(
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
- r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=default)
+ r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=None)
+
+ if not mgid:
+ mgid = self._extract_triforce_mgid(webpage)
+
return mgid
def _real_extract(self, url):
class MTVIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv'
- _VALID_URL = r'https?://(?:www\.)?mtv\.com/(?:video-clips|full-episodes)/(?P<id>[^/?#.]+)'
+ _VALID_URL = r'https?://(?:www\.)?mtv\.com/(?:video-clips|(?:full-)?episodes)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://www.mtv.com/feeds/mrss/'
_TESTS = [{
}, {
'url': 'http://www.mtv.com/full-episodes/94tujl/unlocking-the-truth-gates-of-hell-season-1-ep-101',
'only_matching': True,
+ }, {
+ 'url': 'http://www.mtv.com/episodes/g8xu7q/teen-mom-2-breaking-the-wall-season-7-ep-713',
+ 'only_matching': True,
}]
+class MTV81IE(InfoExtractor):
+ IE_NAME = 'mtv81'
+ _VALID_URL = r'https?://(?:www\.)?mtv81\.com/videos/(?P<id>[^/?#.]+)'
+
+ _TEST = {
+ 'url': 'http://www.mtv81.com/videos/artist-to-watch/the-godfather-of-japanese-hip-hop-segment-1/',
+ 'md5': '1edbcdf1e7628e414a8c5dcebca3d32b',
+ 'info_dict': {
+ 'id': '5e14040d-18a4-47c4-a582-43ff602de88e',
+ 'ext': 'mp4',
+ 'title': 'Unlocking The Truth|July 18, 2016|1|101|Trailer',
+ 'description': '"Unlocking the Truth" premieres August 17th at 11/10c.',
+ 'timestamp': 1468846800,
+ 'upload_date': '20160718',
+ },
+ }
+
+ def _extract_mgid(self, webpage):
+ return self._search_regex(
+ r'getTheVideo\((["\'])(?P<id>mgid:.+?)\1', webpage,
+ 'mgid', group='id')
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+ mgid = self._extract_mgid(webpage)
+ return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)
+
+
class MTVVideoIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv:video'
_VALID_URL = r'''(?x)^https?://
'ext': 'mp4',
'title': 're:^münchen.tv-Livestream [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'is_live': True,
- 'thumbnail': 're:^https?://.*\.jpg$'
+ 'thumbnail': r're:^https?://.*\.jpg$'
},
'params': {
'skip_download': True,
'id': '168859',
'ext': 'flv',
'title': '[M COUNTDOWN] SISTAR - SHAKE IT',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'M COUNTDOWN',
'duration': 206,
'view_count': int,
'id': '173294',
'ext': 'flv',
'title': '[MEET&GREET] Park BoRam',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Mwave',
'duration': 3634,
'view_count': int,
_TESTS = [
{
'url': 'https://myspace.com/fiveminutestothestage/video/little-big-town/109594919',
+ 'md5': '9c1483c106f4a695c47d2911feed50a7',
'info_dict': {
'id': '109594919',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Little Big Town',
'description': 'This country quartet was all smiles while playing a sold out show at the Pacific Amphitheatre in Orange County, California.',
'uploader': 'Five Minutes to the Stage',
'timestamp': 1414108751,
'upload_date': '20141023',
},
- 'params': {
- # rtmp download
- 'skip_download': True,
- },
},
# songs
{
'url': 'https://myspace.com/killsorrow/music/song/of-weakened-soul...-93388656-103880681',
+ 'md5': '1d7ee4604a3da226dd69a123f748b262',
'info_dict': {
'id': '93388656',
- 'ext': 'flv',
+ 'ext': 'm4a',
'title': 'Of weakened soul...',
'uploader': 'Killsorrow',
'uploader_id': 'killsorrow',
},
- 'params': {
- # rtmp download
- 'skip_download': True,
- },
}, {
- 'add_ie': ['Vevo'],
+ 'add_ie': ['Youtube'],
'url': 'https://myspace.com/threedaysgrace/music/song/animal-i-have-become-28400208-28218041',
'info_dict': {
- 'id': 'USZM20600099',
- 'ext': 'mp4',
- 'title': 'Animal I Have Become',
- 'uploader': 'Three Days Grace',
- 'timestamp': int,
- 'upload_date': '20060502',
+ 'id': 'xqds0B_meys',
+ 'ext': 'webm',
+ 'title': 'Three Days Grace - Animal I Have Become',
+ 'description': 'md5:8bd86b3693e72a077cf863a8530c54bb',
+ 'uploader': 'ThreeDaysGraceVEVO',
+ 'uploader_id': 'ThreeDaysGraceVEVO',
+ 'upload_date': '20091002',
},
- 'skip': 'VEVO is only available in some countries',
}, {
'add_ie': ['Youtube'],
'url': 'https://myspace.com/starset2/music/song/first-light-95799905-106964426',
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
+ is_song = mobj.group('mediatype').startswith('music/song')
webpage = self._download_webpage(url, video_id)
player_url = self._search_regex(
- r'playerSwf":"([^"?]*)', webpage, 'player URL')
+ r'videoSwf":"([^"?]*)', webpage, 'player URL', fatal=False)
- def rtmp_format_from_stream_url(stream_url, width=None, height=None):
- rtmp_url, play_path = stream_url.split(';', 1)
- return {
- 'format_id': 'rtmp',
- 'url': rtmp_url,
- 'play_path': play_path,
- 'player_url': player_url,
- 'protocol': 'rtmp',
- 'ext': 'flv',
- 'width': width,
- 'height': height,
- }
+ def formats_from_stream_urls(stream_url, hls_stream_url, http_stream_url, width=None, height=None):
+ formats = []
+ vcodec = 'none' if is_song else None
+ if hls_stream_url:
+ formats.append({
+ 'format_id': 'hls',
+ 'url': hls_stream_url,
+ 'protocol': 'm3u8_native',
+ 'ext': 'm4a' if is_song else 'mp4',
+ 'vcodec': vcodec,
+ })
+ if stream_url and player_url:
+ rtmp_url, play_path = stream_url.split(';', 1)
+ formats.append({
+ 'format_id': 'rtmp',
+ 'url': rtmp_url,
+ 'play_path': play_path,
+ 'player_url': player_url,
+ 'protocol': 'rtmp',
+ 'ext': 'flv',
+ 'width': width,
+ 'height': height,
+ 'vcodec': vcodec,
+ })
+ if http_stream_url:
+ formats.append({
+ 'format_id': 'http',
+ 'url': http_stream_url,
+ 'width': width,
+ 'height': height,
+ 'vcodec': vcodec,
+ })
+ return formats
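Myspace packs the RTMP endpoint and play path into a single `;`-separated stream URL, which `formats_from_stream_urls` splits apart. A minimal sketch with a made-up stream URL:

```python
def split_rtmp(stream_url):
    # Myspace stream URLs carry the RTMP endpoint and the play path in one
    # string separated by ';' (example value below is fabricated).
    rtmp_url, play_path = stream_url.split(';', 1)
    return rtmp_url, play_path

endpoint, path = split_rtmp('rtmp://example.com/app;mp4:videos/clip.mp4')
```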
- if mobj.group('mediatype').startswith('music/song'):
+ if is_song:
# songs don't store any useful info in the 'context' variable
song_data = self._search_regex(
r'''<button.*data-song-id=(["\'])%s\1.*''' % video_id,
return self._search_regex(
r'''data-%s=([\'"])(?P<data>.*?)\1''' % name,
song_data, name, default='', group='data')
- stream_url = search_data('stream-url')
- if not stream_url:
+ formats = formats_from_stream_urls(
+ search_data('stream-url'), search_data('hls-stream-url'),
+ search_data('http-stream-url'))
+ if not formats:
vevo_id = search_data('vevo-id')
youtube_id = search_data('youtube-id')
if vevo_id:
else:
raise ExtractorError(
'Found song but don\'t know how to download it')
+ self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(webpage),
'uploader_id': search_data('artist-username'),
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(search_data('duration')),
- 'formats': [rtmp_format_from_stream_url(stream_url)]
+ 'formats': formats,
}
else:
video = self._parse_json(self._search_regex(
r'context = ({.*?});', webpage, 'context'),
video_id)['video']
- formats = []
- hls_stream_url = video.get('hlsStreamUrl')
- if hls_stream_url:
- formats.append({
- 'format_id': 'hls',
- 'url': hls_stream_url,
- 'protocol': 'm3u8_native',
- 'ext': 'mp4',
- })
- stream_url = video.get('streamUrl')
- if stream_url:
- formats.append(rtmp_format_from_stream_url(
- stream_url,
- int_or_none(video.get('width')),
- int_or_none(video.get('height'))))
+ formats = formats_from_stream_urls(
+ video.get('streamUrl'), video.get('hlsStreamUrl'),
+ video.get('mp4StreamUrl'), int_or_none(video.get('width')),
+ int_or_none(video.get('height')))
self._sort_formats(formats)
return {
'id': video_id,
'id': 'f16b2bbd-cde8-481c-a981-7cd48605df43',
'ext': 'mp4',
'title': 'хозяин жизни',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 25,
},
}, {
else:
video_playpath = ''
- video_swfobj = self._search_regex('swfobject.embedSWF\(\'(.+?)\'', webpage, 'swfobj')
+        video_swfobj = self._search_regex(r'swfobject\.embedSWF\(\'(.+?)\'', webpage, 'swfobj')
video_swfobj = compat_urllib_parse_unquote(video_swfobj)
video_title = self._html_search_regex("<h1(?: class='globalHd')?>(.*?)</h1>",
class NaverIE(InfoExtractor):
- _VALID_URL = r'https?://(?:m\.)?tvcast\.naver\.com/v/(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:m\.)?tv(?:cast)?\.naver\.com/v/(?P<id>\d+)'
_TESTS = [{
- 'url': 'http://tvcast.naver.com/v/81652',
+ 'url': 'http://tv.naver.com/v/81652',
'info_dict': {
'id': '81652',
'ext': 'mp4',
'upload_date': '20130903',
},
}, {
- 'url': 'http://tvcast.naver.com/v/395837',
+ 'url': 'http://tv.naver.com/v/395837',
'md5': '638ed4c12012c458fefcddfd01f173cd',
'info_dict': {
'id': '395837',
'upload_date': '20150519',
},
'skip': 'Georestricted',
+ }, {
+ 'url': 'http://tvcast.naver.com/v/81652',
+ 'only_matching': True,
}]
def _real_extract(self, url):
lowercase_escape,
smuggle_url,
unescapeHTML,
+ update_url_query,
)
'url': 'http://www.nbcnews.com/watch/nbcnews-com/how-twitter-reacted-to-the-snowden-interview-269389891880',
'md5': 'af1adfa51312291a017720403826bb64',
'info_dict': {
- 'id': '269389891880',
+ 'id': 'p_tweet_snow_140529',
'ext': 'mp4',
'title': 'How Twitter Reacted To The Snowden Interview',
'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
'url': 'http://www.nbcnews.com/nightly-news/video/nightly-news-with-brian-williams-full-broadcast-february-4-394064451844',
'md5': '73135a2e0ef819107bbb55a5a9b2a802',
'info_dict': {
- 'id': '394064451844',
+ 'id': 'nn_netcast_150204',
'ext': 'mp4',
'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
'url': 'http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456',
'md5': 'a49e173825e5fcd15c13fc297fced39d',
'info_dict': {
- 'id': '529953347624',
+ 'id': 'x_lon_vwhorn_150922',
'ext': 'mp4',
'title': 'Volkswagen U.S. Chief:\xa0 We Have Totally Screwed Up',
'description': 'md5:c8be487b2d80ff0594c005add88d8351',
'url': 'http://www.today.com/video/see-the-aurora-borealis-from-space-in-stunning-new-nasa-video-669831235788',
'md5': '118d7ca3f0bea6534f119c68ef539f71',
'info_dict': {
- 'id': '669831235788',
+ 'id': 'tdy_al_space_160420',
'ext': 'mp4',
'title': 'See the aurora borealis from space in stunning new NASA video',
'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
'info_dict': {
- 'id': '314487875924',
+ 'id': 'n_hayes_Aimm_140801_272214',
'ext': 'mp4',
'title': 'The chaotic GOP immigration vote',
'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1406937606,
'upload_date': '20140802',
'uploader': 'NBCU-NEWS',
- 'categories': ['MSNBC/Topics/Franchise/Best of last night', 'MSNBC/Topics/General/Congress'],
},
},
{
else:
# "feature" and "nightly-news" pages use theplatform.com
video_id = mobj.group('mpx_id')
- if not video_id.isdigit():
- webpage = self._download_webpage(url, video_id)
- info = None
- bootstrap_json = self._search_regex(
- [r'(?m)(?:var\s+(?:bootstrapJson|playlistData)|NEWS\.videoObj)\s*=\s*({.+});?\s*$',
- r'videoObj\s*:\s*({.+})', r'data-video="([^"]+)"'],
- webpage, 'bootstrap json', default=None)
+ webpage = self._download_webpage(url, video_id)
+
+ filter_param = 'byId'
+ bootstrap_json = self._search_regex(
+ [r'(?m)(?:var\s+(?:bootstrapJson|playlistData)|NEWS\.videoObj)\s*=\s*({.+});?\s*$',
+ r'videoObj\s*:\s*({.+})', r'data-video="([^"]+)"',
+ r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);'],
+ webpage, 'bootstrap json', default=None)
+ if bootstrap_json:
bootstrap = self._parse_json(
bootstrap_json, video_id, transform_source=unescapeHTML)
+
+ info = None
if 'results' in bootstrap:
info = bootstrap['results'][0]['video']
elif 'video' in bootstrap:
info = bootstrap['video']
+ elif 'msnbcVideoInfo' in bootstrap:
+ info = bootstrap['msnbcVideoInfo']['meta']
+ elif 'msnbcThePlatform' in bootstrap:
+ info = bootstrap['msnbcThePlatform']['videoPlayer']['video']
else:
info = bootstrap
- video_id = info['mpxId']
+
+ if 'guid' in info:
+ video_id = info['guid']
+ filter_param = 'byGuid'
+ elif 'mpxId' in info:
+ video_id = info['mpxId']
return {
'_type': 'url_transparent',
'id': video_id,
# http://feed.theplatform.com/f/2E2eJC/nbcnews also works
- 'url': 'http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews?byId=%s' % video_id,
+ 'url': update_url_query('http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews', {filter_param: video_id}),
'ie_key': 'ThePlatformFeed',
}
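The switch to `update_url_query` lets the same theplatform feed URL take either a `byId` or a `byGuid` filter. A minimal re-implementation on top of `urllib.parse` (the real helper lives in youtube-dl's `utils`):

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def update_url_query(url, query):
    # Merge new parameters into the URL's existing query string
    # (simplified version of youtube-dl's utils.update_url_query).
    parsed = urlparse(url)
    qs = dict(parse_qsl(parsed.query))
    qs.update(query)
    return urlunparse(parsed._replace(query=urlencode(qs)))

feed_url = update_url_query(
    'http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews',
    {'byGuid': 'nn_netcast_150204'})
```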
'info_dict': {
'id': 'livestream217',
'ext': 'flv',
- 'title': 're:^NDR Fernsehen Niedersachsen \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+ 'title': r're:^NDR Fernsehen Niedersachsen \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
'upload_date': '20150910',
},
'info_dict': {
'id': 'webradioweltweit100',
'ext': 'mp3',
- 'title': 're:^N-JOY Weltweit \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+ 'title': r're:^N-JOY Weltweit \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'is_live': True,
'uploader': 'njoy',
'upload_date': '20150810',
'description': 'md5:ab2d4b4a6056c5cb4caa6d729deabf02',
'upload_date': '20131208',
'duration': 1327,
- 'thumbnail': 're:https?://.*\.jpg',
+ 'thumbnail': r're:https?://.*\.jpg',
},
}
'comments': 'mincount:3',
'description': 'md5:1eddeacc7e62d5a25a2d1a7290c64a28',
'upload_date': '20120813',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1344858571,
'age_limit': 12,
},
from __future__ import unicode_literals
from .common import InfoExtractor
-from ..utils import parse_iso8601
+from ..compat import compat_urlparse
+from ..utils import (
+ clean_html,
+ get_element_by_class,
+ int_or_none,
+ parse_iso8601,
+ remove_start,
+ unified_timestamp,
+)
class NextMediaIE(InfoExtractor):
'id': '53109199',
'ext': 'mp4',
'title': '【佔領金鐘】50外國領事議員撐場 讚學生勇敢香港有希望',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:28222b9912b6665a21011b034c70fcc7',
'timestamp': 1415456273,
'upload_date': '20141108',
return self._extract_from_nextmedia_page(news_id, url, page)
def _extract_from_nextmedia_page(self, news_id, url, page):
+ redirection_url = self._search_regex(
+ r'window\.location\.href\s*=\s*([\'"])(?P<url>(?!\1).+)\1',
+ page, 'redirection URL', default=None, group='url')
+ if redirection_url:
+ return self.url_result(compat_urlparse.urljoin(url, redirection_url))
+
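The new redirection check catches pages that bounce the browser via a JavaScript `window.location.href` assignment. A standalone sketch of that detection and resolution (the sample markup is fabricated):

```python
import re
from urllib.parse import urljoin

def find_redirect(page, base_url):
    # Detect a JavaScript redirect of the form
    #   window.location.href = '...';
    # and resolve it against the page URL, as the extractor does.
    m = re.search(
        r'window\.location\.href\s*=\s*([\'"])(?P<url>(?!\1).+)\1', page)
    return urljoin(base_url, m.group('url')) if m else None

page = "<script>window.location.href = '/section/article/headline/20150128/36354694';</script>"
target = find_redirect(
    page,
    'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694')
```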
title = self._fetch_title(page)
video_url = self._search_regex(self._URL_PATTERN, page, 'video url')
'id': '19009428',
'ext': 'mp4',
'title': '【壹週刊】細10年男友偷食 50歲邵美琪再失戀',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:cd802fad1f40fd9ea178c1e2af02d659',
'timestamp': 1421791200,
'upload_date': '20150120',
class AppleDailyIE(NextMediaIE):
IE_DESC = '臺灣蘋果日報'
- _VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/(?:animation|appledaily|enews|realtimenews|actionnews)/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
+ _VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/[^/]+/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
_TESTS = [{
'url': 'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
'md5': 'a843ab23d150977cc55ef94f1e2c1e4d',
'id': '36354694',
'ext': 'mp4',
'title': '周亭羽走過摩鐵陰霾2男陪吃 九把刀孤寒看醫生',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2acd430e59956dc47cd7f67cb3c003f4',
'upload_date': '20150128',
}
'id': '550549',
'ext': 'mp4',
'title': '不滿被踩腳 山東兩大媽一路打下車',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:175b4260c1d7c085993474217e4ab1b4',
'upload_date': '20150128',
}
'id': '5003671',
'ext': 'mp4',
'title': '20正妹熱舞 《刀龍傳說Online》火辣上市',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:23c0aac567dc08c9c16a3161a2c2e3cd',
'upload_date': '20150128',
},
'id': '35770334',
'ext': 'mp4',
'title': '咖啡占卜測 XU裝熟指數',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:7b859991a6a4fedbdf3dd3b66545c748',
'upload_date': '20140417',
},
}, {
'url': 'http://www.appledaily.com.tw/actionnews/appledaily/7/20161003/960588/',
'only_matching': True,
+ }, {
+ # Redirected from http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694
+ 'url': 'http://ent.appledaily.com.tw/section/article/headline/20150128/36354694',
+ 'only_matching': True,
}]
_URL_PATTERN = r'\{url: \'(.+)\'\}'
def _fetch_description(self, page):
return self._html_search_meta('description', page, 'news description')
+
+
+class NextTVIE(InfoExtractor):
+ IE_DESC = '壹電視'
+ _VALID_URL = r'https?://(?:www\.)?nexttv\.com\.tw/(?:[^/]+/)+(?P<id>\d+)'
+
+ _TEST = {
+ 'url': 'http://www.nexttv.com.tw/news/realtime/politics/11779671',
+ 'info_dict': {
+ 'id': '11779671',
+ 'ext': 'mp4',
+ 'title': '「超收稅」近4千億! 藍議員籲發消費券',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'timestamp': 1484825400,
+ 'upload_date': '20170119',
+ 'view_count': int,
+ },
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ webpage = self._download_webpage(url, video_id)
+
+ title = self._html_search_regex(
+ r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
+
+ data = self._hidden_inputs(webpage)
+
+ video_url = data['ntt-vod-src-detailview']
+
+ date_str = get_element_by_class('date', webpage)
+ timestamp = unified_timestamp(date_str + '+0800') if date_str else None
+
+ view_count = int_or_none(remove_start(
+ clean_html(get_element_by_class('click', webpage)), '點閱:'))
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'url': video_url,
+ 'thumbnail': data.get('ntt-vod-img-src'),
+ 'timestamp': timestamp,
+ 'view_count': view_count,
+ }
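`NextTVIE` appends `'+0800'` (Taiwan time) to the scraped date string before handing it to `unified_timestamp`. A minimal equivalent with `strptime`; the `'%Y-%m-%d %H:%M'` input format is an assumption for illustration:

```python
from datetime import datetime

def parse_tw_date(date_str):
    # Attach the Taiwan offset before parsing, as the extractor does;
    # the date format here is assumed, not taken from the site.
    dt = datetime.strptime(date_str + ' +0800', '%Y-%m-%d %H:%M %z')
    return int(dt.timestamp())

ts = parse_tw_date('2017-01-19 19:30')
```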
'description': 'md5:56323bfb0ac4ee5ab24bd05fdf3bf478',
'upload_date': '20140921',
'timestamp': 1411337580,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://prod.www.steelers.clubs.nfl.com/video-and-audio/videos/LIVE_Post_Game_vs_Browns/9d72f26a-9e2b-4718-84d3-09fb4046c266',
'description': 'md5:6a97f7e5ebeb4c0e69a418a89e0636e8',
'upload_date': '20131229',
'timestamp': 1388354455,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://www.nfl.com/news/story/0ap3000000467586/article/patriots-seahawks-involved-in-lategame-skirmish',
class NickIE(MTVServicesInfoExtractor):
    # None of the videos on the website are still alive?
IE_NAME = 'nick.com'
- _VALID_URL = r'https?://(?:www\.)?nick(?:jr)?\.com/(?:videos/clip|[^/]+/videos)/(?P<id>[^/?#.]+)'
+ _VALID_URL = r'https?://(?:(?:www|beta)\.)?nick(?:jr)?\.com/(?:[^/]+/)?(?:videos/clip|[^/]+/videos)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://udat.mtvnservices.com/service1/dispatch.htm'
_TESTS = [{
'url': 'http://www.nick.com/videos/clip/alvinnn-and-the-chipmunks-112-full-episode.html',
}, {
'url': 'http://www.nickjr.com/paw-patrol/videos/pups-save-a-goldrush-s3-ep302-full-episode/',
'only_matching': True,
+ }, {
+ 'url': 'http://beta.nick.com/nicky-ricky-dicky-and-dawn/videos/nicky-ricky-dicky-dawn-301-full-episode/',
+ 'only_matching': True,
}]
def _get_feed_query(self, uri):
from .common import InfoExtractor
from ..compat import (
- compat_urllib_parse_urlencode,
compat_urlparse,
)
from ..utils import (
'description': '(c) copyright 2008, Blender Foundation / www.bigbuckbunny.org',
'duration': 33,
},
+ 'skip': 'Requires an account',
}, {
        # Files downloaded with and without credentials differ, so omit
# the md5 field
'timestamp': 1304065916,
'duration': 209,
},
+ 'skip': 'Requires an account',
}, {
        # video exists but is marked as "deleted"
# md5 is unstable
'description': 'deleted',
'title': 'ドラえもんエターナル第3話「決戦第3新東京市」<前編>',
'upload_date': '20071224',
- 'timestamp': 1198527840, # timestamp field has different value if logged in
+ 'timestamp': int, # timestamp field has different value if logged in
'duration': 304,
},
+ 'skip': 'Requires an account',
}, {
'url': 'http://www.nicovideo.jp/watch/so22543406',
'info_dict': {
'upload_date': '20140104',
'uploader': 'アニメロチャンネル',
'uploader_id': '312',
- }
+ },
+ 'skip': 'The viewing period of the video you were searching for has expired.',
}]
_VALID_URL = r'https?://(?:www\.|secure\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
- # Determine whether the downloader used authentication to download video
- _AUTHENTICATED = False
def _real_initialize(self):
self._login()
if re.search(r'(?i)<h1 class="mb8p4">Log in error</h1>', login_results) is not None:
self._downloader.report_warning('unable to log in: bad username or password')
return False
- # Successful login
- self._AUTHENTICATED = True
return True
def _real_extract(self, url):
'http://ext.nicovideo.jp/api/getthumbinfo/' + video_id, video_id,
note='Downloading video info page')
- if self._AUTHENTICATED:
- # Get flv info
- flv_info_webpage = self._download_webpage(
- 'http://flapi.nicovideo.jp/api/getflv/' + video_id + '?as3=1',
- video_id, 'Downloading flv info')
- else:
- # Get external player info
- ext_player_info = self._download_webpage(
- 'http://ext.nicovideo.jp/thumb_watch/' + video_id, video_id)
- thumb_play_key = self._search_regex(
- r'\'thumbPlayKey\'\s*:\s*\'(.*?)\'', ext_player_info, 'thumbPlayKey')
-
- # Get flv info
- flv_info_data = compat_urllib_parse_urlencode({
- 'k': thumb_play_key,
- 'v': video_id
- })
- flv_info_request = sanitized_Request(
- 'http://ext.nicovideo.jp/thumb_watch', flv_info_data,
- {'Content-Type': 'application/x-www-form-urlencoded'})
- flv_info_webpage = self._download_webpage(
- flv_info_request, video_id,
- note='Downloading flv info', errnote='Unable to download flv info')
+ # Get flv info
+ flv_info_webpage = self._download_webpage(
+ 'http://flapi.nicovideo.jp/api/getflv/' + video_id + '?as3=1',
+ video_id, 'Downloading flv info')
flv_info = compat_urlparse.parse_qs(flv_info_webpage)
if 'url' not in flv_info:
if 'deleted' in flv_info:
raise ExtractorError('The video has been deleted.',
expected=True)
+ elif 'closed' in flv_info:
+ raise ExtractorError('Niconico videos now require logging in',
+ expected=True)
else:
raise ExtractorError('Unable to find video URL')
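The getflv API answers with a urlencoded body, and the branches above triage it by key: `url` carries the media URL, while `deleted` and `closed` signal error states. A sketch of that triage over fabricated payloads:

```python
from urllib.parse import parse_qs

def check_flv_info(flv_info_webpage):
    # Mirror the error triage above: 'url' wins, 'deleted'/'closed'
    # map to error states (sample payloads are fabricated).
    flv_info = parse_qs(flv_info_webpage)
    if 'url' not in flv_info:
        if 'deleted' in flv_info:
            return 'deleted'
        if 'closed' in flv_info:
            return 'login required'
        return 'unknown'
    return flv_info['url'][0]

result = check_flv_info('url=http%3A%2F%2Fexample.com%2Fvideo.flv&done=true')
```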
class NosVideoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nosvideo\.com/' + \
- '(?:embed/|\?v=)(?P<id>[A-Za-z0-9]{12})/?'
+ r'(?:embed/|\?v=)(?P<id>[A-Za-z0-9]{12})/?'
_PLAYLIST_URL = 'http://nosvideo.com/xml/{xml_id:s}.xml'
_FILE_DELETED_REGEX = r'<b>File Not Found</b>'
_TEST = {
'id': 'mu8fle7g7rpq',
'ext': 'mp4',
'title': 'big_buck_bunny_480p_surround-fix.avi.mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
'ext': 'flv',
'title': 'Duel: Michal Hrdlička a Petr Suchoň',
'description': 'md5:d0cc509858eee1b1374111c588c6f5d5',
- 'thumbnail': 're:^https?://.*\.(?:jpg)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg)',
},
'params': {
# rtmp download
'ext': 'mp4',
'title': 'Podzemní nemocnice v pražské Krči',
'description': 'md5:f0a42dd239c26f61c28f19e62d20ef53',
- 'thumbnail': 're:^https?://.*\.(?:jpg)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg)',
}
}, {
'url': 'http://novaplus.nova.cz/porad/policie-modrava/video/5591-policie-modrava-15-dil-blondynka-na-hrbitove',
'ext': 'flv',
'title': 'Policie Modrava - 15. díl - Blondýnka na hřbitově',
'description': 'md5:dc24e50be5908df83348e50d1431295e', # Make sure this description is clean of html tags
- 'thumbnail': 're:^https?://.*\.(?:jpg)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg)',
},
'params': {
# rtmp download
'id': '1756858',
'ext': 'flv',
'title': 'Televizní noviny - 30. 5. 2015',
- 'thumbnail': 're:^https?://.*\.(?:jpg)',
+ 'thumbnail': r're:^https?://.*\.(?:jpg)',
'upload_date': '20150530',
},
'params': {
'ext': 'mp4',
'title': 'Zaklínač 3: Divoký hon',
'description': 're:.*Pokud se stejně jako my nemůžete.*',
- 'thumbnail': 're:https?://.*\.jpg(\?.*)?',
+ 'thumbnail': r're:https?://.*\.jpg(\?.*)?',
'upload_date': '20150521',
},
'params': {
)
(?P<id>[a-z\d]{13})
'''
- _VALID_URL = _VALID_URL_TEMPLATE % {'host': 'novamov\.com'}
+ _VALID_URL = _VALID_URL_TEMPLATE % {'host': r'novamov\.com'}
_HOST = 'www.novamov.com'
IE_NAME = 'wholecloud'
IE_DESC = 'WholeCloud'
- _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': '(?:wholecloud\.net|movshare\.(?:net|sx|ag))'}
+ _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'(?:wholecloud\.net|movshare\.(?:net|sx|ag))'}
_HOST = 'www.wholecloud.net'
IE_NAME = 'nowvideo'
IE_DESC = 'NowVideo'
- _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'nowvideo\.(?:to|ch|ec|sx|eu|at|ag|co|li)'}
+ _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'nowvideo\.(?:to|ch|ec|sx|eu|at|ag|co|li)'}
_HOST = 'www.nowvideo.to'
IE_NAME = 'videoweed'
IE_DESC = 'VideoWeed'
- _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'videoweed\.(?:es|com)'}
+ _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'videoweed\.(?:es|com)'}
_HOST = 'www.videoweed.es'
IE_NAME = 'cloudtime'
IE_DESC = 'CloudTime'
- _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'cloudtime\.to'}
+ _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'cloudtime\.to'}
_HOST = 'www.cloudtime.to'
IE_NAME = 'auroravid'
IE_DESC = 'AuroraVid'
- _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': 'auroravid\.to'}
+ _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'auroravid\.to'}
_HOST = 'www.auroravid.to'
'ext': 'mp4',
'title': 'Candor: The Art of Gesticulation',
'description': 'Candor: The Art of Gesticulation',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1446745676,
'upload_date': '20151105',
'uploader_id': '2385340575001',
'ext': 'mp4',
'title': 'Kasper Bjørke ft. Jaakko Eino Kalevi: TNR',
'description': 'Kasper Bjørke ft. Jaakko Eino Kalevi: TNR',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1407315371,
'upload_date': '20140806',
'uploader_id': '2385340575001',
'ext': 'mp4',
'title': 'Bleu, Blanc, Rouge - A Godard Supercut',
'description': 'md5:f0ea5f1857dffca02dbd37875d742cec',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'upload_date': '20150607',
'uploader': 'Cinema Sem Lei',
'uploader_id': 'cinemasemlei',
'ext': 'flv',
'title': 'Inka Bause stellt die neuen Bauern vor',
'description': 'md5:e234e1ed6d63cf06be5c070442612e7e',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1432580700,
'upload_date': '20150525',
'duration': 2786,
'ext': 'flv',
'title': 'Berlin - Tag & Nacht (Folge 934)',
'description': 'md5:c85e88c2e36c552dfe63433bc9506dd0',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1432666800,
'upload_date': '20150526',
'duration': 2641,
'ext': 'flv',
'title': 'Hals- und Beinbruch',
'description': 'md5:b50d248efffe244e6f56737f0911ca57',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1432415400,
'upload_date': '20150523',
'duration': 2742,
'ext': 'flv',
'title': 'Angst!',
'description': 'md5:30cbc4c0b73ec98bcd73c9f2a8c17c4e',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1222632900,
'upload_date': '20080928',
'duration': 3025,
'ext': 'flv',
'title': 'Thema u.a.: Der erste Blick: Die Apple Watch',
'description': 'md5:4312b6c9d839ffe7d8caf03865a531af',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1432751700,
'upload_date': '20150527',
'duration': 1083,
'ext': 'flv',
'title': "Büro-Fall / Chihuahua 'Joel'",
'description': 'md5:e62cb6bf7c3cc669179d4f1eb279ad8d',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1432408200,
'upload_date': '20150523',
'duration': 3092,
'duration': 215,
'title': '3:2 - Deutschland gewinnt Badminton-Länderspiel in Melle',
'description': 'Vor rund 370 Zuschauern gewinnt die deutsche Badminton-Nationalmannschaft am Donnerstag ein EM-Vorbereitungsspiel gegen Frankreich in Melle. Video Moritz Frankenberg.',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
},
}]
if metadata.get('tt888') == 'ja':
subtitles['nl'] = [{
'ext': 'vtt',
- 'url': 'http://e.omroep.nl/tt888/%s' % video_id,
+ 'url': 'http://tt888.omroep.nl/tt888/%s' % video_id,
}]
return {
entries = []
+ conviva = data.get('convivaStatistics') or {}
+ live = (data.get('mediaElementType') == 'Live' or
+ data.get('isLive') is True or conviva.get('isLive'))
+
+ def make_title(t):
+ return self._live_title(t) if live else t
+
media_assets = data.get('mediaAssets')
if media_assets and isinstance(media_assets, list):
def video_id_and_title(idx):
if not formats:
continue
self._sort_formats(formats)
+
+ # Some f4m streams may not work with hdcore in fragments' URLs
+ for f in formats:
+ extra_param = f.get('extra_param_to_segment_url')
+ if extra_param and 'hdcore' in extra_param:
+ del f['extra_param_to_segment_url']
+
entry_id, entry_title = video_id_and_title(num)
duration = parse_duration(asset.get('duration'))
subtitles = {}
})
entries.append({
'id': asset.get('carrierId') or entry_id,
- 'title': entry_title,
+ 'title': make_title(entry_title),
'duration': duration,
'subtitles': subtitles,
'formats': formats,
duration = parse_duration(data.get('duration'))
entries = [{
'id': video_id,
- 'title': title,
+ 'title': make_title(title),
'duration': duration,
'formats': formats,
}]
message_type, message_type)),
expected=True)
- conviva = data.get('convivaStatistics') or {}
series = conviva.get('seriesName') or data.get('seriesTitle')
episode = conviva.get('episodeName') or data.get('episodeNumberOrDate')
+ season_number = None
+ episode_number = None
+ if data.get('mediaElementType') == 'Episode':
+ _season_episode = data.get('scoresStatistics', {}).get('springStreamStream') or \
+ data.get('relativeOriginUrl', '')
+ EPISODENUM_RE = [
+ r'/s(?P<season>\d{,2})e(?P<episode>\d{,2})\.',
+ r'/sesong-(?P<season>\d{,2})/episode-(?P<episode>\d{,2})',
+ ]
+ season_number = int_or_none(self._search_regex(
+ EPISODENUM_RE, _season_episode, 'season number',
+ default=None, group='season'))
+ episode_number = int_or_none(self._search_regex(
+ EPISODENUM_RE, _season_episode, 'episode number',
+ default=None, group='episode'))
+
thumbnails = None
images = data.get('images')
if images and isinstance(images, dict):
} for image in web_images if image.get('imageUrl')]
description = data.get('description')
+ category = data.get('mediaAnalytics', {}).get('category')
common_info = {
'description': description,
'series': series,
'episode': episode,
+ 'season_number': season_number,
+ 'episode_number': episode_number,
+ 'categories': [category] if category else None,
'age_limit': parse_age_limit(data.get('legalAge')),
'thumbnails': thumbnails,
}
class NRKTVIE(NRKBaseIE):
IE_DESC = 'NRK TV and NRK Radio'
- _VALID_URL = r'https?://(?:tv|radio)\.nrk(?:super)?\.no/(?:serie/[^/]+|program)/(?P<id>[a-zA-Z]{4}\d{8})(?:/\d{2}-\d{2}-\d{4})?(?:#del=(?P<part_id>\d+))?'
+ _EPISODE_RE = r'(?P<id>[a-zA-Z]{4}\d{8})'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:tv|radio)\.nrk(?:super)?\.no/
+ (?:serie/[^/]+|program)/
+ (?![Ee]pisodes)%s
+ (?:/\d{2}-\d{2}-\d{4})?
+ (?:\#del=(?P<part_id>\d+))?
+ ''' % _EPISODE_RE
_API_HOST = 'psapi-we.nrk.no'
_TESTS = [{
'title': '20 spørsmål 23.05.2014',
'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
'duration': 1741,
+ 'series': '20 spørsmål - TV',
+ 'episode': '23.05.2014',
},
}, {
'url': 'https://tv.nrk.no/program/mdfp15000514',
- 'md5': '43d0be26663d380603a9cf0c24366531',
'info_dict': {
'id': 'MDFP15000514CA',
'ext': 'mp4',
'title': 'Grunnlovsjubiléet - Stor ståhei for ingenting 24.05.2014',
'description': 'md5:89290c5ccde1b3a24bb8050ab67fe1db',
'duration': 4605,
+ 'series': 'Kunnskapskanalen',
+ 'episode': '24.05.2014',
+ },
+ 'params': {
+ 'skip_download': True,
},
}, {
# single playlist video
'url': 'https://tv.nrk.no/serie/tour-de-ski/MSPO40010515/06-01-2015#del=2',
- 'md5': 'adbd1dbd813edaf532b0a253780719c2',
'info_dict': {
'id': 'MSPO40010515-part2',
'ext': 'flv',
'title': 'Tour de Ski: Sprint fri teknikk, kvinner og menn 06.01.2015 (del 2:2)',
'description': 'md5:238b67b97a4ac7d7b4bf0edf8cc57d26',
},
- 'skip': 'Only works from Norway',
+ 'params': {
+ 'skip_download': True,
+ },
+ 'expected_warnings': ['Video is geo restricted'],
+ 'skip': 'particular part is not supported currently',
}, {
'url': 'https://tv.nrk.no/serie/tour-de-ski/MSPO40010515/06-01-2015',
'playlist': [{
- 'md5': '9480285eff92d64f06e02a5367970a7a',
'info_dict': {
- 'id': 'MSPO40010515-part1',
- 'ext': 'flv',
- 'title': 'Tour de Ski: Sprint fri teknikk, kvinner og menn 06.01.2015 (del 1:2)',
- 'description': 'md5:238b67b97a4ac7d7b4bf0edf8cc57d26',
+ 'id': 'MSPO40010515AH',
+ 'ext': 'mp4',
+ 'title': 'Sprint fri teknikk, kvinner og menn 06.01.2015 (Part 1)',
+ 'description': 'md5:c03aba1e917561eface5214020551b7a',
+ 'duration': 772,
+ 'series': 'Tour de Ski',
+ 'episode': '06.01.2015',
+ },
+ 'params': {
+ 'skip_download': True,
},
}, {
- 'md5': 'adbd1dbd813edaf532b0a253780719c2',
'info_dict': {
- 'id': 'MSPO40010515-part2',
- 'ext': 'flv',
- 'title': 'Tour de Ski: Sprint fri teknikk, kvinner og menn 06.01.2015 (del 2:2)',
- 'description': 'md5:238b67b97a4ac7d7b4bf0edf8cc57d26',
+ 'id': 'MSPO40010515BH',
+ 'ext': 'mp4',
+ 'title': 'Sprint fri teknikk, kvinner og menn 06.01.2015 (Part 2)',
+ 'description': 'md5:c03aba1e917561eface5214020551b7a',
+ 'duration': 6175,
+ 'series': 'Tour de Ski',
+ 'episode': '06.01.2015',
+ },
+ 'params': {
+ 'skip_download': True,
},
}],
'info_dict': {
'id': 'MSPO40010515',
- 'title': 'Tour de Ski: Sprint fri teknikk, kvinner og menn',
- 'description': 'md5:238b67b97a4ac7d7b4bf0edf8cc57d26',
- 'duration': 6947.52,
+ 'title': 'Sprint fri teknikk, kvinner og menn 06.01.2015',
+ 'description': 'md5:c03aba1e917561eface5214020551b7a',
+ },
+ 'expected_warnings': ['Video is geo restricted'],
+ }, {
+ 'url': 'https://tv.nrk.no/serie/anno/KMTE50001317/sesong-3/episode-13',
+ 'info_dict': {
+ 'id': 'KMTE50001317AA',
+ 'ext': 'mp4',
+ 'title': 'Anno 13:30',
+ 'description': 'md5:11d9613661a8dbe6f9bef54e3a4cbbfa',
+ 'duration': 2340,
+ 'series': 'Anno',
+ 'episode': '13:30',
+ 'season_number': 3,
+ 'episode_number': 13,
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ 'url': 'https://tv.nrk.no/serie/nytt-paa-nytt/MUHH46000317/27-01-2017',
+ 'info_dict': {
+ 'id': 'MUHH46000317AA',
+ 'ext': 'mp4',
+ 'title': 'Nytt på Nytt 27.01.2017',
+ 'description': 'md5:5358d6388fba0ea6f0b6d11c48b9eb4b',
+ 'duration': 1796,
+ 'series': 'Nytt på nytt',
+ 'episode': '27.01.2017',
+ },
+ 'params': {
+ 'skip_download': True,
},
- 'skip': 'Only works from Norway',
}, {
'url': 'https://radio.nrk.no/serie/dagsnytt/NPUB21019315/12-07-2015#',
'only_matching': True,
}]
-class NRKPlaylistIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?nrk\.no/(?!video|skole)(?:[^/]+/)+(?P<id>[^/]+)'
+class NRKTVDirekteIE(NRKTVIE):
+ IE_DESC = 'NRK TV Direkte and NRK Radio Direkte'
+ _VALID_URL = r'https?://(?:tv|radio)\.nrk\.no/direkte/(?P<id>[^/?#&]+)'
+
+ _TESTS = [{
+ 'url': 'https://tv.nrk.no/direkte/nrk1',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://radio.nrk.no/direkte/p1_oslo_akershus',
+ 'only_matching': True,
+ }]
+
+
+class NRKPlaylistBaseIE(InfoExtractor):
+ def _extract_description(self, webpage):
+ pass
+
+ def _real_extract(self, url):
+ playlist_id = self._match_id(url)
+
+ webpage = self._download_webpage(url, playlist_id)
+
+ entries = [
+ self.url_result('nrk:%s' % video_id, NRKIE.ie_key())
+ for video_id in re.findall(self._ITEM_RE, webpage)
+ ]
+
+ playlist_title = self._extract_title(webpage)
+ playlist_description = self._extract_description(webpage)
+
+ return self.playlist_result(
+ entries, playlist_id, playlist_title, playlist_description)
+
+
+class NRKPlaylistIE(NRKPlaylistBaseIE):
+ _VALID_URL = r'https?://(?:www\.)?nrk\.no/(?!video|skole)(?:[^/]+/)+(?P<id>[^/]+)'
+ _ITEM_RE = r'class="[^"]*\brich\b[^"]*"[^>]+data-video-id="([^"]+)"'
_TESTS = [{
'url': 'http://www.nrk.no/troms/gjenopplev-den-historiske-solformorkelsen-1.12270763',
'info_dict': {
'playlist_count': 5,
}]
+ def _extract_title(self, webpage):
+ return self._og_search_title(webpage, fatal=False)
+
+ def _extract_description(self, webpage):
+ return self._og_search_description(webpage)
+
+
+class NRKTVEpisodesIE(NRKPlaylistBaseIE):
+ _VALID_URL = r'https?://tv\.nrk\.no/program/[Ee]pisodes/[^/]+/(?P<id>\d+)'
+ _ITEM_RE = r'data-episode=["\']%s' % NRKTVIE._EPISODE_RE
+ _TESTS = [{
+ 'url': 'https://tv.nrk.no/program/episodes/nytt-paa-nytt/69031',
+ 'info_dict': {
+ 'id': '69031',
+ 'title': 'Nytt på nytt, sesong: 201210',
+ },
+ 'playlist_count': 4,
+ }]
+
+ def _extract_title(self, webpage):
+ return self._html_search_regex(
+ r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
+
+
+class NRKTVSeriesIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:tv|radio)\.nrk(?:super)?\.no/serie/(?P<id>[^/]+)'
+ _ITEM_RE = r'(?:data-season=["\']|id=["\']season-)(?P<id>\d+)'
+ _TESTS = [{
+ 'url': 'https://tv.nrk.no/serie/groenn-glede',
+ 'info_dict': {
+ 'id': 'groenn-glede',
+ 'title': 'Grønn glede',
+ 'description': 'md5:7576e92ae7f65da6993cf90ee29e4608',
+ },
+ 'playlist_mincount': 9,
+ }, {
+ 'url': 'http://tv.nrksuper.no/serie/labyrint',
+ 'info_dict': {
+ 'id': 'labyrint',
+ 'title': 'Labyrint',
+ 'description': 'md5:58afd450974c89e27d5a19212eee7115',
+ },
+ 'playlist_mincount': 3,
+ }, {
+ 'url': 'https://tv.nrk.no/serie/broedrene-dal-og-spektralsteinene',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://tv.nrk.no/serie/saving-the-human-race',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://tv.nrk.no/serie/postmann-pat',
+ 'only_matching': True,
+ }]
+
+ @classmethod
+ def suitable(cls, url):
+ return False if NRKTVIE.suitable(url) else super(NRKTVSeriesIE, cls).suitable(url)
+
def _real_extract(self, url):
- playlist_id = self._match_id(url)
+ series_id = self._match_id(url)
- webpage = self._download_webpage(url, playlist_id)
+ webpage = self._download_webpage(url, series_id)
entries = [
- self.url_result('nrk:%s' % video_id, 'NRK')
- for video_id in re.findall(
- r'class="[^"]*\brich\b[^"]*"[^>]+data-video-id="([^"]+)"',
- webpage)
+ self.url_result(
+ 'https://tv.nrk.no/program/Episodes/{series}/{season}'.format(
+ series=series_id, season=season_id))
+ for season_id in re.findall(self._ITEM_RE, webpage)
]
- playlist_title = self._og_search_title(webpage)
- playlist_description = self._og_search_description(webpage)
+ title = self._html_search_meta(
+ 'seriestitle', webpage,
+ 'title', default=None) or self._og_search_title(
+ webpage, fatal=False)
- return self.playlist_result(
- entries, playlist_id, playlist_title, playlist_description)
+ description = self._html_search_meta(
+ 'series_description', webpage,
+ 'description', default=None) or self._og_search_description(webpage)
+
+ return self.playlist_result(entries, series_id, title, description)
class NRKSkoleIE(InfoExtractor):
'info_dict': {
'id': '14438086',
'ext': 'mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'title': 'Schnee und Glätte führen zu zahlreichen Unfällen und Staus',
'alt_title': 'Winterchaos auf deutschen Straßen',
'description': 'Schnee und Glätte sorgen deutschlandweit für einen chaotischen Start in die Woche: Auf den Straßen kommt es zu kilometerlangen Staus und Dutzenden Glätteunfällen. In Düsseldorf und München wirbelt der Schnee zudem den Flugplan durcheinander. Dutzende Flüge landen zu spät, einige fallen ganz aus.',
'ext': 'mp4',
'title': 'Командующий Черноморским флотом провел переговоры в штабе ВМС Украины',
'description': 'Командующий Черноморским флотом провел переговоры в штабе ВМС Украины',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
'duration': 136,
},
}, {
'ext': 'mp4',
'title': 'Родные пассажиров пропавшего Boeing не верят в трагический исход',
'description': 'Родные пассажиров пропавшего Boeing не верят в трагический исход',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
'duration': 172,
},
}, {
'ext': 'mp4',
'title': '«Сегодня». 21 марта 2014 года. 16:00',
'description': '«Сегодня». 21 марта 2014 года. 16:00',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
'duration': 1496,
},
}, {
'ext': 'mp4',
'title': 'Остросюжетный фильм «Кома»',
'description': 'Остросюжетный фильм «Кома»',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
'duration': 5592,
},
}, {
'ext': 'mp4',
'title': '«Дело врачей»: «Деревце жизни»',
'description': '«Дело врачей»: «Деревце жизни»',
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
'duration': 2590,
},
}]
'id': 'hb-zelt',
'ext': 'mp4',
'title': 're:^Live-Kamera: Hofbräuzelt [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'is_live': True,
},
'params': {
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .jwplatform import JWPlatformBaseIE
+from ..utils import (
+ ExtractorError,
+ js_to_json,
+)
+
+
+class OnDemandKoreaIE(JWPlatformBaseIE):
+ _VALID_URL = r'https?://(?:www\.)?ondemandkorea\.com/(?P<id>[^/]+)\.html'
+ _TEST = {
+ 'url': 'http://www.ondemandkorea.com/ask-us-anything-e43.html',
+ 'info_dict': {
+ 'id': 'ask-us-anything-e43',
+ 'ext': 'mp4',
+ 'title': 'Ask Us Anything : E43',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download'
+ }
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id, fatal=False)
+
+ if not webpage:
+ # Page sometimes returns captcha page with HTTP 403
+ raise ExtractorError(
+ 'Unable to access page. You may have been blocked.',
+ expected=True)
+
+ if 'msg_block_01.png' in webpage:
+ self.raise_geo_restricted(
+ 'This content is not available in your region')
+
+ if 'This video is only available to ODK PLUS members.' in webpage:
+ raise ExtractorError(
+ 'This video is only available to ODK PLUS members.',
+ expected=True)
+
+ title = self._og_search_title(webpage)
+
+ jw_config = self._parse_json(
+ self._search_regex(
+ r'(?s)jwplayer\(([\'"])(?:(?!\1).)+\1\)\.setup\s*\((?P<options>.+?)\);',
+ webpage, 'jw config', group='options'),
+ video_id, transform_source=js_to_json)
+ info = self._parse_jwplayer_data(
+ jw_config, video_id, require_title=False, m3u8_id='hls',
+ base_url=url)
+
+ info.update({
+ 'title': title,
+ 'thumbnail': self._og_search_thumbnail(webpage),
+ })
+ return info
'id': '2937',
'ext': 'mp4',
'title': 'Hannibal charges forward, stops for a cocktail',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'The A.V. Club',
'uploader_id': 'the-av-club',
},
_CONTENT_TREE_BASE = _PLAYER_BASE + 'player_api/v1/content_tree/'
_AUTHORIZATION_URL_TEMPLATE = _PLAYER_BASE + 'sas/player_api/v2/authorization/embed_code/%s/%s?'
- def _extract(self, content_tree_url, video_id, domain='example.org', supportedformats=None):
+ def _extract(self, content_tree_url, video_id, domain='example.org', supportedformats=None, embed_token=None):
content_tree = self._download_json(content_tree_url, video_id)['content_tree']
metadata = content_tree[list(content_tree)[0]]
embed_code = metadata['embed_code']
self._AUTHORIZATION_URL_TEMPLATE % (pcode, embed_code) +
compat_urllib_parse_urlencode({
'domain': domain,
- 'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds',
+ 'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds,dash,smooth',
+ 'embedToken': embed_token,
}), video_id)
cur_auth_data = auth_data['authorization_data'][embed_code]
elif delivery_type == 'hds' or ext == 'f4m':
formats.extend(self._extract_f4m_formats(
s_url + '?hdcore=3.7.0', embed_code, f4m_id='hds', fatal=False))
+ elif delivery_type == 'dash' or ext == 'mpd':
+ formats.extend(self._extract_mpd_formats(
+ s_url, embed_code, mpd_id='dash', fatal=False))
+ elif delivery_type == 'smooth':
+ formats.extend(self._extract_ism_formats(
+ s_url, embed_code, ism_id='mss', fatal=False))
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
s_url, embed_code, fatal=False))
embed_code = self._match_id(url)
domain = smuggled_data.get('domain')
supportedformats = smuggled_data.get('supportedformats')
+ embed_token = smuggled_data.get('embed_token')
content_tree_url = self._CONTENT_TREE_BASE + 'embed_code/%s/%s' % (embed_code, embed_code)
- return self._extract(content_tree_url, embed_code, domain, supportedformats)
+ return self._extract(content_tree_url, embed_code, domain, supportedformats, embed_token)
class OoyalaExternalIE(OoyalaBaseIE):
# coding: utf-8
-from __future__ import unicode_literals, division
+from __future__ import unicode_literals
import re
from .common import InfoExtractor
-from ..compat import (
- compat_chr,
- compat_ord,
-)
+from ..compat import compat_chr
from ..utils import (
determine_ext,
ExtractorError,
)
-from ..jsinterp import (
- JSInterpreter,
- _NAME_RE
-)
class OpenloadIE(InfoExtractor):
- _VALID_URL = r'https?://openload\.(?:co|io)/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
+ _VALID_URL = r'https?://(?:openload\.(?:co|io)|oload\.tv)/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
_TESTS = [{
'url': 'https://openload.co/f/kUEfGclsU9o',
'id': 'kUEfGclsU9o',
'ext': 'mp4',
'title': 'skyrim_no-audio_1080.mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'https://openload.co/embed/rjC09fkPLYs',
'id': 'rjC09fkPLYs',
'ext': 'mp4',
'title': 'movie.mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'subtitles': {
'en': [{
'ext': 'vtt',
# for title and ext
'url': 'https://openload.co/embed/Sxz5sADo82g/',
'only_matching': True,
+ }, {
+ 'url': 'https://oload.tv/embed/KnG-kKZdcfY/',
+ 'only_matching': True,
}]
- def openload_decode(self, txt):
- symbol_dict = {
- '(゚Д゚) [゚Θ゚]': '_',
- '(゚Д゚) [゚ω゚ノ]': 'a',
- '(゚Д゚) [゚Θ゚ノ]': 'b',
- '(゚Д゚) [\'c\']': 'c',
- '(゚Д゚) [゚ー゚ノ]': 'd',
- '(゚Д゚) [゚Д゚ノ]': 'e',
- '(゚Д゚) [1]': 'f',
- '(゚Д゚) [\'o\']': 'o',
- '(o゚ー゚o)': 'u',
- '(゚Д゚) [\'c\']': 'c',
- '((゚ー゚) + (o^_^o))': '7',
- '((o^_^o) +(o^_^o) +(c^_^o))': '6',
- '((゚ー゚) + (゚Θ゚))': '5',
- '(-~3)': '4',
- '(-~-~1)': '3',
- '(-~1)': '2',
- '(-~0)': '1',
- '((c^_^o)-(c^_^o))': '0',
- }
- delim = '(゚Д゚)[゚ε゚]+'
- end_token = '(゚Д゚)[゚o゚]'
- symbols = '|'.join(map(re.escape, symbol_dict.keys()))
- txt = re.sub('(%s)\+\s?' % symbols, lambda m: symbol_dict[m.group(1)], txt)
- ret = ''
- for aacode in re.findall(r'{0}\+\s?{1}(.*?){0}'.format(re.escape(end_token), re.escape(delim)), txt):
- for aachar in aacode.split(delim):
- if aachar.isdigit():
- ret += compat_chr(int(aachar, 8))
- else:
- m = re.match(r'^u([\da-f]{4})$', aachar)
- if m:
- ret += compat_chr(int(m.group(1), 16))
- else:
- self.report_warning("Cannot decode: %s" % aachar)
- return ret
+ @staticmethod
+ def _extract_urls(webpage):
+ return re.findall(
+ r'<iframe[^>]+src=["\']((?:https?://)?(?:openload\.(?:co|io)|oload\.tv)/embed/[a-zA-Z0-9-_]+)',
+ webpage)
def _real_extract(self, url):
video_id = self._match_id(url)
if 'File not found' in webpage or 'deleted by the owner' in webpage:
raise ExtractorError('File not found', expected=True)
- # The following decryption algorithm is written by @yokrysty and
- # declared to be freely used in youtube-dl
- # See https://github.com/rg3/youtube-dl/issues/10408
- enc_data = self._html_search_regex(
- r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
- webpage, 'encrypted data')
+ ol_id = self._search_regex(
+ '<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>',
+ webpage, 'openload ID')
- enc_code = self._html_search_regex(r'<script[^>]+>(゚ω゚[^<]+)</script>',
- webpage, 'encrypted code')
+ first_three_chars = int(float(ol_id[0:][:3]))
+ fifth_char = int(float(ol_id[3:5]))
+ urlcode = ''
+ num = 5
- js_code = self.openload_decode(enc_code)
- jsi = JSInterpreter(js_code)
+ while num < len(ol_id):
+ urlcode += compat_chr(int(float(ol_id[num:][:3])) +
+ first_three_chars - fifth_char * int(float(ol_id[num + 3:][:2])))
+ num += 5
- m_offset_fun = self._search_regex(r'slice\(0\s*-\s*(%s)\(\)' % _NAME_RE, js_code, 'javascript offset function')
- m_diff_fun = self._search_regex(r'charCodeAt\(0\)\s*\+\s*(%s)\(\)' % _NAME_RE, js_code, 'javascript diff function')
-
- offset = jsi.call_function(m_offset_fun)
- diff = jsi.call_function(m_diff_fun)
-
- video_url_chars = []
-
- for idx, c in enumerate(enc_data):
- j = compat_ord(c)
- if j >= 33 and j <= 126:
- j = ((j + 14) % 94) + 33
- if idx == len(enc_data) - offset:
- j += diff
- video_url_chars += compat_chr(j)
-
- video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
+ video_url = 'https://openload.co/stream/' + urlcode
title = self._og_search_title(webpage, default=None) or self._search_regex(
r'<span[^>]+class=["\']title["\'][^>]*>([^<]+)', webpage,
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'url': video_url,
# Seems all videos have extensions in their titles
- 'ext': determine_ext(title),
+ 'ext': determine_ext(title, 'mp4'),
'subtitles': subtitles,
}
-
return info_dict
'title': 'Weitere Evakuierungen um Vulkan Calbuco',
'description': 'md5:d689c959bdbcf04efeddedbf2299d633',
'duration': 68.197,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150425',
},
}
float_or_none,
parse_duration,
str_to_int,
+ urlencode_postdata,
)
'ext': 'flv',
'title': '頭を撫でてくれる?',
'description': '頭を撫でてくれる?',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 39,
'upload_date': '20151218',
'uploader': 'カワイイ動物まとめ',
r'^v(\d+)[Uu]rl$', format_id, 'height', default=None)
if not height:
continue
+
+ play_url = self._download_json(
+ 'http://m.pandora.tv/?c=api&m=play_url', video_id,
+ data=urlencode_postdata({
+ 'prgid': video_id,
+ 'runtime': info.get('runtime'),
+ 'vod_url': format_url,
+ }),
+ headers={
+ 'Origin': url,
+ 'Content-Type': 'application/x-www-form-urlencoded',
+ })
+ format_url = play_url.get('url')
+ if not format_url:
+ continue
+
formats.append({
'format_id': '%sp' % height,
'url': format_url,
'title': 'Great Performances - Dudamel Conducts Verdi Requiem at the Hollywood Bowl - Full',
'description': 'md5:657897370e09e2bc6bf0f8d2cd313c6b',
'duration': 6559,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'description': 'md5:c741d14e979fc53228c575894094f157',
'title': 'NOVA - Killer Typhoon',
'duration': 3172,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140122',
'age_limit': 10,
},
'title': 'American Experience - Death and the Civil War, Chapter 1',
'description': 'md5:67fa89a9402e2ee7d08f53b920674c18',
'duration': 682,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True, # requires ffmpeg
'title': 'FRONTLINE - United States of Secrets (Part One)',
'description': 'md5:55756bd5c551519cc4b7703e373e217e',
'duration': 6851,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'title': "A Chef's Life - Season 3, Ep. 5: Prickly Business",
'description': 'md5:c0ff7475a4b70261c7e58f493c2792a5',
'duration': 1480,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'title': 'FRONTLINE - The Atomic Artists',
'description': 'md5:f677e4520cfacb4a5ce1471e31b57800',
'duration': 723,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True, # requires ffmpeg
'ext': 'mp4',
'title': 'FRONTLINE - Netanyahu at War',
'duration': 6852,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'formats': 'mincount:8',
},
},
410: 'This video has expired and is no longer available for online streaming.',
}
+ def _real_initialize(self):
+ cookie = (self._download_json(
+ 'http://localization.services.pbs.org/localize/auto/cookie/',
+ None, headers=self.geo_verification_headers(), fatal=False) or {}).get('cookie')
+ if cookie:
+ station = self._search_regex(r'#?s=\["([^"]+)"', cookie, 'station')
+ if station:
+ self._set_cookie('.pbs.org', 'pbsol.station', station)
+
def _extract_webpage(self, url):
mobj = re.match(self._VALID_URL, url)
redirect_info = self._download_json(
'%s?format=json' % redirect['url'], display_id,
- 'Downloading %s video url info' % (redirect_id or num))
+ 'Downloading %s video url info' % (redirect_id or num),
+ headers=self.geo_verification_headers())
if redirect_info['status'] == 'error':
raise ExtractorError(
# Try turning it to 'program - title' naming scheme if possible
alt_title = info.get('program', {}).get('title')
if alt_title:
- info['title'] = alt_title + ' - ' + re.sub(r'^' + alt_title + '[\s\-:]+', '', info['title'])
+ info['title'] = alt_title + ' - ' + re.sub(r'^' + alt_title + r'[\s\-:]+', '', info['title'])
description = info.get('description') or info.get(
'program', {}).get('description') or description
'ext': 'mp4',
'title': 'Astronaut Love Triangle Victim Speaks Out: “The Crime in 2007 Hasn’t Defined Us”',
'description': 'Colleen Shipman speaks to PEOPLE for the first time about life after the attack',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 246.318,
'timestamp': 1458720585,
'upload_date': '20160323',
from __future__ import unicode_literals
-from .zdf import ZDFIE
+from .dreisat import DreiSatIE
-class PhoenixIE(ZDFIE):
+class PhoenixIE(DreiSatIE):
IE_NAME = 'phoenix.de'
_VALID_URL = r'''(?x)https?://(?:www\.)?phoenix\.de/content/
(?:
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+ ExtractorError,
+ dict_get,
+ int_or_none,
+ unescapeHTML,
+ parse_iso8601,
+)
+
+
+class PikselIE(InfoExtractor):
+ _VALID_URL = r'https?://player\.piksel\.com/v/(?P<id>[a-z0-9]+)'
+ _TESTS = [
+ {
+ 'url': 'http://player.piksel.com/v/nv60p12f',
+ 'md5': 'd9c17bbe9c3386344f9cfd32fad8d235',
+ 'info_dict': {
+ 'id': 'nv60p12f',
+ 'ext': 'mp4',
+ 'title': 'فن الحياة - الحلقة 1',
+ 'description': 'احدث برامج الداعية الاسلامي " مصطفي حسني " فى رمضان 2016علي النهار نور',
+ 'timestamp': 1465231790,
+ 'upload_date': '20160606',
+ }
+ },
+ {
+ # Original source: http://www.uscourts.gov/cameras-courts/state-washington-vs-donald-j-trump-et-al
+ 'url': 'https://player.piksel.com/v/v80kqp41',
+ 'md5': '753ddcd8cc8e4fa2dda4b7be0e77744d',
+ 'info_dict': {
+ 'id': 'v80kqp41',
+ 'ext': 'mp4',
+ 'title': 'WAW- State of Washington vs. Donald J. Trump, et al',
+ 'description': 'State of Washington vs. Donald J. Trump, et al, Case Number 17-CV-00141-JLR, TRO Hearing, Civil Rights Case, 02/3/2017, 1:00 PM (PST), Seattle Federal Courthouse, Seattle, WA, Judge James L. Robart presiding.',
+ 'timestamp': 1486171129,
+ 'upload_date': '20170204',
+ }
+ }
+ ]
+
+ @staticmethod
+ def _extract_url(webpage):
+ mobj = re.search(
+ r'<iframe[^>]+src=["\'](?P<url>(?:https?:)?//player\.piksel\.com/v/[a-z0-9]+)',
+ webpage)
+ if mobj:
+ return mobj.group('url')
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ webpage = self._download_webpage(url, video_id)
+ app_token = self._search_regex([
+ r'clientAPI\s*:\s*"([^"]+)"',
+ r'data-de-api-key\s*=\s*"([^"]+)"'
+ ], webpage, 'app token')
+ response = self._download_json(
+ 'http://player.piksel.com/ws/ws_program/api/%s/mode/json/apiv/5' % app_token,
+ video_id, query={
+ 'v': video_id
+ })['response']
+ failure = response.get('failure')
+ if failure:
+ raise ExtractorError(response['failure']['reason'], expected=True)
+ video_data = response['WsProgramResponse']['program']['asset']
+ title = video_data['title']
+
+ formats = []
+
+ m3u8_url = dict_get(video_data, [
+ 'm3u8iPadURL',
+ 'ipadM3u8Url',
+ 'm3u8AndroidURL',
+ 'm3u8iPhoneURL',
+ 'iphoneM3u8Url'])
+ if m3u8_url:
+ formats.extend(self._extract_m3u8_formats(
+ m3u8_url, video_id, 'mp4', 'm3u8_native',
+ m3u8_id='hls', fatal=False))
+
+ asset_type = dict_get(video_data, ['assetType', 'asset_type'])
+ for asset_file in video_data.get('assetFiles', []):
+ # TODO: extract rtmp formats
+ http_url = asset_file.get('http_url')
+ if not http_url:
+ continue
+ tbr = None
+ vbr = int_or_none(asset_file.get('videoBitrate'), 1024)
+ abr = int_or_none(asset_file.get('audioBitrate'), 1024)
+ if asset_type == 'video':
+ tbr = vbr + abr if vbr and abr else None
+ elif asset_type == 'audio':
+ tbr = abr
+
+ format_id = ['http']
+ if tbr:
+ format_id.append(compat_str(tbr))
+
+ formats.append({
+ 'format_id': '-'.join(format_id),
+ 'url': unescapeHTML(http_url),
+ 'vbr': vbr,
+ 'abr': abr,
+ 'width': int_or_none(asset_file.get('videoWidth')),
+ 'height': int_or_none(asset_file.get('videoHeight')),
+ 'filesize': int_or_none(asset_file.get('filesize')),
+ 'tbr': tbr,
+ })
+ self._sort_formats(formats)
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': video_data.get('description'),
+ 'thumbnail': video_data.get('thumbnailUrl'),
+ 'timestamp': parse_iso8601(video_data.get('dateadd')),
+ 'formats': formats,
+ }
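The Piksel code above relies on `dict_get` to fall through a list of candidate manifest keys. As a rough illustration (a minimal sketch, not the exact upstream helper, and with an invented `example.com` URL), the behaviour is:

```python
def dict_get(d, keys, default=None):
    # Minimal sketch of youtube-dl's utils.dict_get; the real helper also
    # accepts a single key and a skip_false_values flag.
    for key in keys:
        value = d.get(key)
        if value:
            return value
    return default

# Hypothetical asset dict: the first candidate key is present but empty,
# so the second one wins.
video_data = {'m3u8iPadURL': None, 'ipadM3u8Url': 'https://example.com/master.m3u8'}
m3u8_url = dict_get(video_data, ['m3u8iPadURL', 'ipadM3u8Url', 'm3u8AndroidURL'])
```

This is why the extractor can probe five differently-named HLS keys with a single call.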
'ext': 'mp4',
'title': 'Brandon Semenuk - RAW 100',
'description': 'Official release: www.redbull.ca/rupertwalker',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 100,
'upload_date': '20150406',
'uploader': 'revelco',
'ext': 'mp4',
'title': 'Тайны перевала Дятлова • 1 серия 2 часть',
'description': 'Документальный сериал-расследование одной из самых жутких тайн ХХ века',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 694,
'age_limit': 0,
},
'ext': 'mp4',
'title': 'Vyžeňte vosy a sršně ze zahrady',
'description': 'md5:f93d398691044d303bc4a3de62f3e976',
- 'thumbnail': 're:(?i)^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:(?i)^https?://.*\.(?:jpg|png)$',
'duration': 279,
'timestamp': 1438732860,
'upload_date': '20150805',
'ext': 'flv',
'title': 're:^Přímý přenos iDNES.cz [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'Sledujte provoz na ranveji Letiště Václava Havla v Praze',
- 'thumbnail': 're:(?i)^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:(?i)^https?://.*\.(?:jpg|png)$',
'is_live': True,
},
'params': {
'ext': 'mp4',
'title': 'Zavřeli jsme mraženou pizzu do auta. Upekla se',
'description': 'md5:01e73f02329e2e5760bd5eed4d42e3c2',
- 'thumbnail': 're:(?i)^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:(?i)^https?://.*\.(?:jpg|png)$',
'duration': 39,
'timestamp': 1438969140,
'upload_date': '20150807',
'ext': 'mp4',
'title': 'Táhni! Demonstrace proti imigrantům budila emoce',
'description': 'md5:97c81d589a9491fbfa323c9fa3cca72c',
- 'thumbnail': 're:(?i)^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:(?i)^https?://.*\.(?:jpg|png)$',
'timestamp': 1439052180,
'upload_date': '20150808',
'is_live': False,
'ext': 'mp4',
'title': 'Recesisté udělali z billboardu kolotoč',
'description': 'md5:7369926049588c3989a66c9c1a043c4c',
- 'thumbnail': 're:(?i)^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:(?i)^https?://.*\.(?:jpg|png)$',
'timestamp': 1415725500,
'upload_date': '20141111',
'is_live': False,
'ext': 'mp4',
'title': 'Ellen Euro Cutie Blond Takes a Sexy Survey Get Facial in The Park',
'age_limit': 18,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}]
'id': '3353705',
'ext': 'mp4',
'title': 'S04_RM_UCL_Rus',
- 'thumbnail': 're:^https?://.*\.png$',
+ 'thumbnail': r're:^https?://.*\.png$',
'duration': 145.94,
},
}, {
display_id = '%s-%s' % (name, clip_id)
- parsed_url = compat_urlparse.urlparse(url)
-
- payload_url = compat_urlparse.urlunparse(parsed_url._replace(
- netloc='app.pluralsight.com', path='player/api/v1/payload'))
-
course = self._download_json(
- payload_url, display_id, headers={'Referer': url})['payload']['course']
+ 'https://app.pluralsight.com/player/user/api/v1/player/payload',
+ display_id, data=urlencode_postdata({'courseId': course_name}),
+ headers={'Referer': url})
collection = course['modules']
'timestamp': 1456594200,
'upload_date': '20160227',
'duration': 2364,
- 'thumbnail': 're:^https?://static\.prsa\.pl/images/.*\.jpg$'
+ 'thumbnail': r're:^https?://static\.prsa\.pl/images/.*\.jpg$'
},
}],
}, {
'display_id': 'teen-grabs-a-dildo-and-fucks-her-pussy-live-on-1hottie-i-rec',
'ext': 'mp4',
'title': 'Teen grabs a dildo and fucks her pussy live on 1hottie, I rec',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 551,
'view_count': int,
'age_limit': 18,
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import (
+ compat_parse_qs,
+ compat_str,
+)
+from ..utils import (
+ int_or_none,
+ try_get,
+ unified_timestamp,
+)
+
+
+class PornFlipIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?pornflip\.com/(?:v|embed)/(?P<id>[0-9A-Za-z]{11})'
+ _TESTS = [{
+ 'url': 'https://www.pornflip.com/v/wz7DfNhMmep',
+ 'md5': '98c46639849145ae1fd77af532a9278c',
+ 'info_dict': {
+ 'id': 'wz7DfNhMmep',
+ 'ext': 'mp4',
+ 'title': '2 Amateurs swallow make his dream cumshots true',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'duration': 112,
+ 'timestamp': 1481655502,
+ 'upload_date': '20161213',
+ 'uploader_id': '106786',
+ 'uploader': 'figifoto',
+ 'view_count': int,
+ 'age_limit': 18,
+ }
+ }, {
+ 'url': 'https://www.pornflip.com/embed/wz7DfNhMmep',
+ 'only_matching': True,
+ }]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ webpage = self._download_webpage(
+ 'https://www.pornflip.com/v/%s' % video_id, video_id)
+
+ flashvars = compat_parse_qs(self._search_regex(
+ r'<embed[^>]+flashvars=(["\'])(?P<flashvars>(?:(?!\1).)+)\1',
+ webpage, 'flashvars', group='flashvars'))
+
+ title = flashvars['video_vars[title]'][0]
+
+ def flashvar(kind):
+ return try_get(
+ flashvars, lambda x: x['video_vars[%s]' % kind][0], compat_str)
+
+ formats = []
+ for key, value in flashvars.items():
+ if not (value and isinstance(value, list)):
+ continue
+ format_url = value[0]
+        if key == 'video_vars[hds_manifest]':
+            # despite the "hds" key name, this is a DASH (MPD) manifest
+            formats.extend(self._extract_mpd_formats(
+                format_url, video_id, mpd_id='dash', fatal=False))
+ continue
+ height = self._search_regex(
+ r'video_vars\[video_urls\]\[(\d+)', key, 'height', default=None)
+ if not height:
+ continue
+ formats.append({
+ 'url': format_url,
+ 'format_id': 'http-%s' % height,
+ 'height': int_or_none(height),
+ })
+ self._sort_formats(formats)
+
+ uploader = self._html_search_regex(
+ (r'<span[^>]+class="name"[^>]*>\s*<a[^>]+>\s*<strong>(?P<uploader>[^<]+)',
+ r'<meta[^>]+content=(["\'])[^>]*\buploaded by (?P<uploader>.+?)\1'),
+ webpage, 'uploader', fatal=False, group='uploader')
+
+ return {
+ 'id': video_id,
+ 'formats': formats,
+ 'title': title,
+ 'thumbnail': flashvar('big_thumb'),
+ 'duration': int_or_none(flashvar('duration')),
+ 'timestamp': unified_timestamp(self._html_search_meta(
+ 'uploadDate', webpage, 'timestamp')),
+ 'uploader_id': flashvar('author_id'),
+ 'uploader': uploader,
+ 'view_count': int_or_none(flashvar('views')),
+ 'age_limit': 18,
+ }
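The repeated `[0]` indexing in the PornFlip extractor follows from how `compat_parse_qs` (which resolves to the standard library's `parse_qs` on Python 3) shapes its result: every key maps to a list of values. A standalone illustration with an invented query string:

```python
from urllib.parse import parse_qs  # compat_parse_qs resolves to this on Python 3

# Invented flashvars query string in the same key style as the extractor
flashvars = parse_qs('video_vars[title]=Example+clip&video_vars[duration]=112')

# parse_qs maps every key to a *list* of values, hence the trailing [0]
title = flashvars['video_vars[title]'][0]
duration = flashvars['video_vars[duration]'][0]
```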
'ext': 'mp4',
'title': 'Restroom selfie masturbation',
'description': 'md5:3748420395e03e31ac96857a8f125b2b',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'view_count': int,
'age_limit': 18,
}
'ext': 'mp4',
'title': 'Sierra loves doing laundry',
'description': 'md5:8ff0523848ac2b8f9b065ba781ccf294',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'view_count': int,
'age_limit': 18,
},
comment_count = self._extract_count(
r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')
- video_urls = list(map(compat_urllib_parse_unquote, re.findall(r"player_quality_[0-9]{3}p\s*=\s*'([^']+)'", webpage)))
+ video_urls = []
+ for quote, video_url in re.findall(
+ r'player_quality_[0-9]{3,4}p\s*=\s*(["\'])(.+?)\1;', webpage):
+ video_urls.append(compat_urllib_parse_unquote(re.sub(
+ r'{0}\s*\+\s*{0}'.format(quote), '', video_url)))
+
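The rewritten loop above also copes with pages that split a player URL across concatenated JS string literals; the `re.sub` rejoins them by deleting the quote-plus-quote glue. A self-contained sketch with an invented URL:

```python
import re

# Invented example of the JS assignment the regex targets
line = "player_quality_720p = 'https://cdn.example.com/vid' + 'eo_720p.mp4';"

quote, video_url = re.findall(
    r"player_quality_[0-9]{3,4}p\s*=\s*([\"'])(.+?)\1;", line)[0]
# Strip the quote-plus-quote sequences so the split literal is one URL again
full_url = re.sub(r'{0}\s*\+\s*{0}'.format(quote), '', video_url)
```

Capturing the quote character and backreferencing it (`\1`) lets the same pattern handle both single- and double-quoted assignments.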
if webpage.find('"encrypted":true') != -1:
password = compat_urllib_parse_unquote_plus(
self._search_regex(r'"video_title":"([^"]+)', webpage, 'password'))
webpage = self._download_webpage(url, playlist_id)
- entries = self._extract_entries(webpage)
+        # Only process the container div with the main playlist content,
+        # skipping the drop-down menu that uses a similar pattern for
+        # videos (see https://github.com/rg3/youtube-dl/issues/11594).
+ container = self._search_regex(
+ r'(?s)(<div[^>]+class=["\']container.+)', webpage,
+ 'container', default=webpage)
+
+ entries = self._extract_entries(container)
playlist = self._parse_json(
self._search_regex(
class PornHubPlaylistIE(PornHubPlaylistBaseIE):
_VALID_URL = r'https?://(?:www\.)?pornhub\.com/playlist/(?P<id>\d+)'
_TESTS = [{
- 'url': 'http://www.pornhub.com/playlist/6201671',
+ 'url': 'http://www.pornhub.com/playlist/4667351',
'info_dict': {
- 'id': '6201671',
- 'title': 'P0p4',
+ 'id': '4667351',
+ 'title': 'Nataly Hot',
},
- 'playlist_mincount': 35,
+ 'playlist_mincount': 2,
}]
'description': 'md5:a8304bef7ef06cb4ab476ca6029b01b0',
'categories': ['Adult Humor', 'Blondes'],
'uploader': 'Alpha Blue Archives',
- 'thumbnail': 're:^https?://.*\\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1417582800,
'age_limit': 18,
}
'ext': 'mp4',
'title': 'Recherche appartement',
'description': 'md5:fe10cb92ae2dd3ed94bb4080d11ff493',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140925',
'duration': 120,
'view_count': int,
'display_id': 'striptease-from-sexy-secretary',
'description': 'md5:0ee35252b685b3883f4a1d38332f9980',
'categories': list, # NSFW
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
}
}
'ext': 'mp4',
'title': 'Organic mattresses used to clean waste water',
'upload_date': '20160409',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'description': 'md5:20002e654bbafb6908395a5c0cfcd125'
}
}
'id': '86D1CE8462-576CAAE416',
'ext': 'mp4',
'title': 'oceans.mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
formats.extend(self._extract_m3u8_formats(
source_url, clip_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
+ elif mimetype == 'application/dash+xml':
+ formats.extend(self._extract_mpd_formats(
+ source_url, clip_id, mpd_id='dash', fatal=False))
else:
tbr = fix_bitrate(source['bitrate'])
if protocol in ('rtmp', 'rtmpe'):
'url': 'http://www.prosieben.de/tv/circus-halligalli/videos/218-staffel-2-episode-18-jahresrueckblick-ganze-folge',
'info_dict': {
'id': '2104602',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Episode 18 - Staffel 2',
'description': 'md5:8733c81b702ea472e069bc48bb658fc1',
'upload_date': '20131231',
'duration': 5845.04,
},
- 'params': {
- # rtmp download
- 'skip_download': True,
- },
},
{
'url': 'http://www.prosieben.de/videokatalog/Gesellschaft/Leben/Trends/video-Lady-Umstyling-f%C3%BCr-Audrina-Rebekka-Audrina-Fergen-billig-aussehen-Battal-Modica-700544.html',
'url': 'http://www.the-voice-of-germany.de/video/31-andreas-kuemmert-rocket-man-clip',
'info_dict': {
'id': '2572814',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Andreas Kümmert: Rocket Man',
'description': 'md5:6ddb02b0781c6adf778afea606652e38',
'upload_date': '20131017',
'url': 'http://www.fem.com/wellness/videos/wellness-video-clip-kurztripps-zum-valentinstag.html',
'info_dict': {
'id': '2156342',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Kurztrips zum Valentinstag',
'description': 'Romantischer Kurztrip zum Valentinstag? Nina Heinemann verrät, was sich hier wirklich lohnt.',
'duration': 307.24,
'description': 'md5:63b8963e71f481782aeea877658dec84',
},
'playlist_count': 2,
+ 'skip': 'This video is unavailable',
},
{
'url': 'http://www.7tv.de/circus-halligalli/615-best-of-circus-halligalli-ganze-folge',
'info_dict': {
'id': '4187506',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': 'Best of Circus HalliGalli',
'description': 'md5:8849752efd90b9772c9db6fdf87fb9e9',
'upload_date': '20151229',
title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
info = self._extract_video_info(url, clip_id)
description = self._html_search_regex(
- self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
+ self._DESCRIPTION_REGEXES, webpage, 'description', default=None)
+ if description is None:
+ description = self._og_search_description(webpage)
thumbnail = self._og_search_thumbnail(webpage)
upload_date = unified_strdate(self._html_search_regex(
self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
self._PLAYLIST_ID_REGEXES, webpage, 'playlist id')
playlist = self._parse_json(
self._search_regex(
- 'var\s+contentResources\s*=\s*(\[.+?\]);\s*</script',
+ r'var\s+contentResources\s*=\s*(\[.+?\]);\s*</script',
webpage, 'playlist'),
playlist_id)
entries = []
'release_date': '20141227',
'creator': '林俊杰',
'description': 'md5:d327722d0361576fde558f1ac68a7065',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'note': 'There is no mp3-320 version of this song.',
'release_date': '20050626',
'creator': '李季美',
'description': 'md5:46857d5ed62bc4ba84607a805dccf437',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'note': 'lyrics not in .lrc format',
'release_date': '19970225',
'creator': 'Dark Funeral',
'description': 'md5:ed14d5bd7ecec19609108052c25b2c11',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True,
'ext': 'mp4',
'title': 'Policiais humilham suspeito à beira da morte: "Morre com dignidade"',
'description': 'md5:01812008664be76a6479aa58ec865b72',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 98,
'like_count': int,
'view_count': int,
'duration': 178,
'width': 512,
'title': 'Druck auf Patrick Öztürk',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'description': 'Gegen den SPD-Bürgerschaftsabgeordneten Patrick Öztürk wird wegen Beihilfe zum gewerbsmäßigen Betrug ermittelt. Am Donnerstagabend sollte er dem Vorstand des SPD-Unterbezirks Bremerhaven dazu Rede und Antwort stehen.',
},
}
raise ExtractorError('This video is DRM protected.', expected=True)
device_types = ['ipad']
- if app_code != 'toutv':
- device_types.append('flash')
if not smuggled_data:
+ device_types.append('flash')
device_types.append('android')
formats = []
continue
f_url = re.sub(r'\d+\.%s' % ext, '%d.%s' % (tbr, ext), v_url)
protocol = determine_protocol({'url': f_url})
- formats.append({
+ f = {
'format_id': '%s-%d' % (protocol, tbr),
'url': f_url,
'ext': 'flv' if protocol == 'rtmp' else ext,
'width': int_or_none(url_e.get('width')),
'height': int_or_none(url_e.get('height')),
'tbr': tbr,
- })
+ }
+ mobj = re.match(r'(?P<url>rtmp://[^/]+/[^/]+)/(?P<playpath>[^?]+)(?P<auth>\?.+)', f_url)
+ if mobj:
+ f.update({
+ 'url': mobj.group('url') + mobj.group('auth'),
+ 'play_path': mobj.group('playpath'),
+ })
+ formats.append(f)
if protocol == 'rtsp':
base_url = self._search_regex(
r'rtsp://([^?]+)', f_url, 'base url', default=None)
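The new `re.match` above splits an RTMP URL into the application URL and the play path, keeping the auth query attached to the connection URL. A standalone check against an invented URL of that shape:

```python
import re

RTMP_RE = r'(?P<url>rtmp://[^/]+/[^/]+)/(?P<playpath>[^?]+)(?P<auth>\?.+)'

# Hypothetical URL following the host/app/playpath?auth layout
f_url = 'rtmp://media.example.com/ondemand/some/dir/stream.mp4?auth=token123'
mobj = re.match(RTMP_RE, f_url)
# The format dict's 'url' becomes app URL + auth; 'play_path' is the rest
conn_url = mobj.group('url') + mobj.group('auth')
play_path = mobj.group('playpath')
```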
'ext': 'mp3',
'title': 're:^NDR 2 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:591c49c702db1a33751625ebfb67f273',
- 'thumbnail': 're:^https?://.*\.png',
+ 'thumbnail': r're:^https?://.*\.png',
'is_live': True,
},
'params': {
'id': 'chaartaar-ashoobam',
'ext': 'mp4',
'title': 'Chaartaar - Ashoobam',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'upload_date': '20150215',
'view_count': int,
'like_count': int,
'description': 'md5:f27c544694cacb46a078db84ec35d2d9',
'upload_date': '20140407',
'duration': 6160,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
},
{
'title': 'TG PRIMO TEMPO',
'upload_date': '20140612',
'duration': 1758,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'Geo-restricted to Italy',
},
'description': 'md5:364b604f7db50594678f483353164fb8',
'upload_date': '20140923',
'duration': 386,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
},
]
'ext': 'mp3',
'title': 'Main Stage - Ford & Lopatin',
'description': 'md5:4f340fb48426423530af5a9d87bd7b91',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 2452,
'timestamp': 1307103164,
'upload_date': '20110603',
webpage, 'video data'))
def get_json_value(key, fatal=False):
- return self._search_regex('"%s"\s*:\s*"([^"]+)"' % key, video_data, key, fatal=fatal)
+ return self._search_regex(r'"%s"\s*:\s*"([^"]+)"' % key, video_data, key, fatal=fatal)
title = unescapeHTML(get_json_value('title', fatal=True))
mmid, fid = re.search(r',/(\d+)\?f=(\d+)', get_json_value('flv', fatal=True)).groups()
'title': 'MONA LISA',
'uploader': 'ALKILADOS',
'uploader_id': '216429',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
}]
'id': 'LYV6doKo7f',
'ext': 'mp4',
'title': 'Luati-le Banii sez 4 ep 1',
- 'description': 're:^Iata-ne reveniti dupa o binemeritata vacanta\. +Va astept si pe Facebook cu pareri si comentarii.$',
+ 'description': r're:^Iata-ne reveniti dupa o binemeritata vacanta\. +Va astept si pe Facebook cu pareri si comentarii.$',
}
}
'ext': 'mp4',
'title': 'Further Adventures in Finance and Felony Trailer',
'description': 'md5:6d31f55f30cb101b5476c4a379e324a3',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1464876000,
'upload_date': '20160602',
}
'ext': 'mp4',
'title': 'Million Dollars, But...: Million Dollars, But... The Game Announcement',
'description': 'md5:0cc3b21986d54ed815f5faeccd9a9ca5',
- 'thumbnail': 're:^https?://.*\.png$',
+ 'thumbnail': r're:^https?://.*\.png$',
'series': 'Million Dollars, But...',
'episode': 'Million Dollars, But... The Game Announcement',
'comment_count': int,
'ext': 'mp4',
'title': 'Toy Story 3',
'description': 'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}
import re
from .common import InfoExtractor
+from ..compat import compat_HTTPError
from ..utils import (
float_or_none,
parse_iso8601,
unescapeHTML,
+ ExtractorError,
)
-class RteIE(InfoExtractor):
- IE_NAME = 'rte'
- IE_DESC = 'Raidió Teilifís Éireann TV'
- _VALID_URL = r'https?://(?:www\.)?rte\.ie/player/[^/]{2,3}/show/[^/]+/(?P<id>[0-9]+)'
- _TEST = {
- 'url': 'http://www.rte.ie/player/ie/show/iwitness-862/10478715/',
- 'info_dict': {
- 'id': '10478715',
- 'ext': 'flv',
- 'title': 'Watch iWitness online',
- 'thumbnail': 're:^https?://.*\.jpg$',
- 'description': 'iWitness : The spirit of Ireland, one voice and one minute at a time.',
- 'duration': 60.046,
- },
- 'params': {
- 'skip_download': 'f4m fails with --test atm'
- }
- }
-
- def _real_extract(self, url):
- video_id = self._match_id(url)
- webpage = self._download_webpage(url, video_id)
-
- title = self._og_search_title(webpage)
- description = self._html_search_meta('description', webpage, 'description')
- duration = float_or_none(self._html_search_meta(
- 'duration', webpage, 'duration', fatal=False), 1000)
-
- thumbnail = None
- thumbnail_meta = self._html_search_meta('thumbnail', webpage)
- if thumbnail_meta:
- thumbnail_id = self._search_regex(
- r'uri:irus:(.+)', thumbnail_meta,
- 'thumbnail id', fatal=False)
- if thumbnail_id:
- thumbnail = 'http://img.rasset.ie/%s.jpg' % thumbnail_id
-
- feeds_url = self._html_search_meta('feeds-prefix', webpage, 'feeds url') + video_id
- json_string = self._download_json(feeds_url, video_id)
-
- # f4m_url = server + relative_url
- f4m_url = json_string['shows'][0]['media:group'][0]['rte:server'] + json_string['shows'][0]['media:group'][0]['url']
- f4m_formats = self._extract_f4m_formats(f4m_url, video_id)
- self._sort_formats(f4m_formats)
-
- return {
- 'id': video_id,
- 'title': title,
- 'formats': f4m_formats,
- 'description': description,
- 'thumbnail': thumbnail,
- 'duration': duration,
- }
-
-
-class RteRadioIE(InfoExtractor):
- IE_NAME = 'rte:radio'
- IE_DESC = 'Raidió Teilifís Éireann radio'
- # Radioplayer URLs have two distinct specifier formats,
- # the old format #!rii=<channel_id>:<id>:<playable_item_id>:<date>:
- # the new format #!rii=b<channel_id>_<id>_<playable_item_id>_<date>_
- # where the IDs are int/empty, the date is DD-MM-YYYY, and the specifier may be truncated.
- # An <id> uniquely defines an individual recording, and is the only part we require.
- _VALID_URL = r'https?://(?:www\.)?rte\.ie/radio/utils/radioplayer/rteradioweb\.html#!rii=(?:b?[0-9]*)(?:%3A|:|%5F|_)(?P<id>[0-9]+)'
-
- _TESTS = [{
- # Old-style player URL; HLS and RTMPE formats
- 'url': 'http://www.rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=16:10507902:2414:27-12-2015:',
- 'info_dict': {
- 'id': '10507902',
- 'ext': 'mp4',
- 'title': 'Gloria',
- 'thumbnail': 're:^https?://.*\.jpg$',
- 'description': 'md5:9ce124a7fb41559ec68f06387cabddf0',
- 'timestamp': 1451203200,
- 'upload_date': '20151227',
- 'duration': 7230.0,
- },
- 'params': {
- 'skip_download': 'f4m fails with --test atm'
- }
- }, {
- # New-style player URL; RTMPE formats only
- 'url': 'http://rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=b16_3250678_8861_06-04-2012_',
- 'info_dict': {
- 'id': '3250678',
- 'ext': 'flv',
- 'title': 'The Lyric Concert with Paul Herriott',
- 'thumbnail': 're:^https?://.*\.jpg$',
- 'description': '',
- 'timestamp': 1333742400,
- 'upload_date': '20120406',
- 'duration': 7199.016,
- },
- 'params': {
- 'skip_download': 'f4m fails with --test atm'
- }
- }]
-
+class RteBaseIE(InfoExtractor):
def _real_extract(self, url):
item_id = self._match_id(url)
- json_string = self._download_json(
- 'http://www.rte.ie/rteavgen/getplaylist/?type=web&format=json&id=' + item_id,
- item_id)
+ try:
+ json_string = self._download_json(
+ 'http://www.rte.ie/rteavgen/getplaylist/?type=web&format=json&id=' + item_id,
+ item_id)
+ except ExtractorError as ee:
+ if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404:
+ error_info = self._parse_json(ee.cause.read().decode(), item_id, fatal=False)
+ if error_info:
+ raise ExtractorError(
+ '%s said: %s' % (self.IE_NAME, error_info['message']),
+ expected=True)
+ raise
# NB the string values in the JSON are stored using XML escaping(!)
show = json_string['shows'][0]
'duration': duration,
'formats': formats,
}
+
+
+class RteIE(RteBaseIE):
+ IE_NAME = 'rte'
+ IE_DESC = 'Raidió Teilifís Éireann TV'
+ _VALID_URL = r'https?://(?:www\.)?rte\.ie/player/[^/]{2,3}/show/[^/]+/(?P<id>[0-9]+)'
+ _TEST = {
+ 'url': 'http://www.rte.ie/player/ie/show/iwitness-862/10478715/',
+ 'md5': '4a76eb3396d98f697e6e8110563d2604',
+ 'info_dict': {
+ 'id': '10478715',
+ 'ext': 'mp4',
+ 'title': 'iWitness',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'description': 'The spirit of Ireland, one voice and one minute at a time.',
+ 'duration': 60.046,
+ 'upload_date': '20151012',
+ 'timestamp': 1444694160,
+ },
+ }
+
+
+class RteRadioIE(RteBaseIE):
+ IE_NAME = 'rte:radio'
+ IE_DESC = 'Raidió Teilifís Éireann radio'
+ # Radioplayer URLs have two distinct specifier formats,
+ # the old format #!rii=<channel_id>:<id>:<playable_item_id>:<date>:
+ # the new format #!rii=b<channel_id>_<id>_<playable_item_id>_<date>_
+ # where the IDs are int/empty, the date is DD-MM-YYYY, and the specifier may be truncated.
+ # An <id> uniquely defines an individual recording, and is the only part we require.
+ _VALID_URL = r'https?://(?:www\.)?rte\.ie/radio/utils/radioplayer/rteradioweb\.html#!rii=(?:b?[0-9]*)(?:%3A|:|%5F|_)(?P<id>[0-9]+)'
+
+ _TESTS = [{
+ # Old-style player URL; HLS and RTMPE formats
+ 'url': 'http://www.rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=16:10507902:2414:27-12-2015:',
+ 'md5': 'c79ccb2c195998440065456b69760411',
+ 'info_dict': {
+ 'id': '10507902',
+ 'ext': 'mp4',
+ 'title': 'Gloria',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'description': 'md5:9ce124a7fb41559ec68f06387cabddf0',
+ 'timestamp': 1451203200,
+ 'upload_date': '20151227',
+ 'duration': 7230.0,
+ },
+ }, {
+ # New-style player URL; RTMPE formats only
+ 'url': 'http://rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=b16_3250678_8861_06-04-2012_',
+ 'info_dict': {
+ 'id': '3250678',
+ 'ext': 'flv',
+ 'title': 'The Lyric Concert with Paul Herriott',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'description': '',
+ 'timestamp': 1333742400,
+ 'upload_date': '20120406',
+ 'duration': 7199.016,
+ },
+ 'params': {
+ # rtmp download
+ 'skip_download': True,
+ },
+ }]
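The comment in `RteRadioIE` describes two specifier formats, and `_VALID_URL` tolerates both by accepting `:`/`%3A` as well as `_`/`%5F` separators. The two test URLs above can be checked against the pattern directly:

```python
import re

# Same pattern as RteRadioIE._VALID_URL in the diff above
VALID_URL = (r'https?://(?:www\.)?rte\.ie/radio/utils/radioplayer/rteradioweb'
             r'\.html#!rii=(?:b?[0-9]*)(?:%3A|:|%5F|_)(?P<id>[0-9]+)')

old_style = 'http://www.rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=16:10507902:2414:27-12-2015:'
new_style = 'http://rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=b16_3250678_8861_06-04-2012_'

# Only the second field (the recording id) is captured in either format
old_id = re.match(VALID_URL, old_style).group('id')
new_id = re.match(VALID_URL, new_style).group('id')
```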
from __future__ import unicode_literals
import re
+
from .common import InfoExtractor
+from ..utils import int_or_none
class RTL2IE(InfoExtractor):
'id': 'folge-203-0',
'ext': 'f4v',
'title': 'GRIP sucht den Sommerkönig',
- 'description': 'Matthias, Det und Helge treten gegeneinander an.'
+ 'description': 'md5:e3adbb940fd3c6e76fa341b8748b562f'
},
'params': {
# rtmp download
'id': '21040-anna-erwischt-alex',
'ext': 'mp4',
'title': 'Anna erwischt Alex!',
- 'description': 'Anna ist Alex\' Tochter bei Köln 50667.'
+ 'description': 'Anna nimmt ihrem Vater nicht ab, dass er nicht spielt. Und tatsächlich erwischt sie ihn auf frischer Tat.'
},
'params': {
# rtmp download
r'vico_id\s*:\s*([0-9]+)', webpage, 'vico_id')
vivi_id = self._html_search_regex(
r'vivi_id\s*:\s*([0-9]+)', webpage, 'vivi_id')
- info_url = 'http://www.rtl2.de/video/php/get_video.php?vico_id=' + vico_id + '&vivi_id=' + vivi_id
- info = self._download_json(info_url, video_id)
+ info = self._download_json(
+ 'http://www.rtl2.de/sites/default/modules/rtl2/mediathek/php/get_video_jw.php',
+ video_id, query={
+ 'vico_id': vico_id,
+ 'vivi_id': vivi_id,
+ })
video_info = info['video']
title = video_info['titel']
- description = video_info.get('beschreibung')
- thumbnail = video_info.get('image')
- download_url = video_info['streamurl']
- download_url = download_url.replace('\\', '')
- stream_url = 'mp4:' + self._html_search_regex(r'ondemand/(.*)', download_url, 'stream URL')
- rtmp_conn = ['S:connect', 'O:1', 'NS:pageUrl:' + url, 'NB:fpad:0', 'NN:videoFunction:1', 'O:0']
+ formats = []
+
+ rtmp_url = video_info.get('streamurl')
+ if rtmp_url:
+ rtmp_url = rtmp_url.replace('\\', '')
+ stream_url = 'mp4:' + self._html_search_regex(r'/ondemand/(.+)', rtmp_url, 'stream URL')
+ rtmp_conn = ['S:connect', 'O:1', 'NS:pageUrl:' + url, 'NB:fpad:0', 'NN:videoFunction:1', 'O:0']
+
+ formats.append({
+ 'format_id': 'rtmp',
+ 'url': rtmp_url,
+ 'play_path': stream_url,
+ 'player_url': 'http://www.rtl2.de/flashplayer/vipo_player.swf',
+ 'page_url': url,
+ 'flash_version': 'LNX 11,2,202,429',
+ 'rtmp_conn': rtmp_conn,
+ 'no_resume': True,
+ 'preference': 1,
+ })
+
+ m3u8_url = video_info.get('streamurl_hls')
+ if m3u8_url:
+ formats.extend(self._extract_akamai_formats(m3u8_url, video_id))
- formats = [{
- 'url': download_url,
- 'play_path': stream_url,
- 'player_url': 'http://www.rtl2.de/flashplayer/vipo_player.swf',
- 'page_url': url,
- 'flash_version': 'LNX 11,2,202,429',
- 'rtmp_conn': rtmp_conn,
- 'no_resume': True,
- }]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
- 'thumbnail': thumbnail,
- 'description': description,
+ 'thumbnail': video_info.get('image'),
+ 'description': video_info.get('beschreibung'),
+ 'duration': int_or_none(video_info.get('duration')),
'formats': formats,
}
'ext': 'mp4',
'timestamp': 1424039400,
'title': 'RTL Nieuws - Nieuwe beelden Kopenhagen: chaos direct na aanslag',
- 'thumbnail': 're:^https?://screenshots\.rtl\.nl/(?:[^/]+/)*sz=[0-9]+x[0-9]+/uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed$',
+ 'thumbnail': r're:^https?://screenshots\.rtl\.nl/(?:[^/]+/)*sz=[0-9]+x[0-9]+/uuid=84ae5571-ac25-4225-ae0c-ef8d9efb2aed$',
'upload_date': '20150215',
'description': 'Er zijn nieuwe beelden vrijgegeven die vlak na de aanslag in Kopenhagen zijn gemaakt. Op de video is goed te zien hoe omstanders zich bekommeren om één van de slachtoffers, terwijl de eerste agenten ter plaatse komen.',
}
'id': 'f536aac0-1dc3-4314-920e-3bd1c5b3811a',
'ext': 'mp4',
'title': 'RTL Nieuws - Meer beelden van overval juwelier',
- 'thumbnail': 're:^https?://screenshots\.rtl\.nl/(?:[^/]+/)*sz=[0-9]+x[0-9]+/uuid=f536aac0-1dc3-4314-920e-3bd1c5b3811a$',
+ 'thumbnail': r're:^https?://screenshots\.rtl\.nl/(?:[^/]+/)*sz=[0-9]+x[0-9]+/uuid=f536aac0-1dc3-4314-920e-3bd1c5b3811a$',
'timestamp': 1437233400,
'upload_date': '20150718',
'duration': 30.474,
'ext': 'mp3',
'title': 'Paixões Cruzadas',
'description': 'As paixões musicais de António Cartaxo e António Macedo',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
'params': {
# rtmp download
import re
from .srgssr import SRGSSRIE
-from ..compat import (
- compat_str,
- compat_urllib_parse_urlparse,
-)
+from ..compat import compat_str
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
unescapeHTML,
- xpath_text,
+ determine_ext,
)
class RTSIE(SRGSSRIE):
IE_DESC = 'RTS.ch'
- _VALID_URL = r'rts:(?P<rts_id>\d+)|https?://(?:www\.)?rts\.ch/(?:[^/]+/){2,}(?P<id>[0-9]+)-(?P<display_id>.+?)\.html'
+ _VALID_URL = r'rts:(?P<rts_id>\d+)|https?://(?:.+?\.)?rts\.ch/(?:[^/]+/){2,}(?P<id>[0-9]+)-(?P<display_id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.rts.ch/archives/tv/divers/3449373-les-enfants-terribles.html',
- 'md5': 'f254c4b26fb1d3c183793d52bc40d3e7',
+ 'md5': 'ff7f8450a90cf58dacb64e29707b4a8e',
'info_dict': {
'id': '3449373',
'display_id': 'les-enfants-terribles',
'uploader': 'Divers',
'upload_date': '19680921',
'timestamp': -40280400,
- 'thumbnail': 're:^https?://.*\.image',
+ 'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
- 'params': {
- # m3u8 download
- 'skip_download': True,
- }
},
{
'url': 'http://www.rts.ch/emissions/passe-moi-les-jumelles/5624067-entre-ciel-et-mer.html',
- 'md5': 'f1077ac5af686c76528dc8d7c5df29ba',
'info_dict': {
- 'id': '5742494',
- 'display_id': '5742494',
- 'ext': 'mp4',
- 'duration': 3720,
- 'title': 'Les yeux dans les cieux - Mon homard au Canada',
- 'description': 'md5:d22ee46f5cc5bac0912e5a0c6d44a9f7',
- 'uploader': 'Passe-moi les jumelles',
- 'upload_date': '20140404',
- 'timestamp': 1396635300,
- 'thumbnail': 're:^https?://.*\.image',
- 'view_count': int,
+ 'id': '5624065',
+ 'title': 'Passe-moi les jumelles',
},
- 'params': {
- # m3u8 download
- 'skip_download': True,
- }
+ 'playlist_mincount': 4,
},
{
'url': 'http://www.rts.ch/video/sport/hockey/5745975-1-2-kloten-fribourg-5-2-second-but-pour-gotteron-par-kwiatowski.html',
- 'md5': 'b4326fecd3eb64a458ba73c73e91299d',
'info_dict': {
'id': '5745975',
'display_id': '1-2-kloten-fribourg-5-2-second-but-pour-gotteron-par-kwiatowski',
'uploader': 'Hockey',
'upload_date': '20140403',
'timestamp': 1396556882,
- 'thumbnail': 're:^https?://.*\.image',
+ 'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ },
'skip': 'Blocked outside Switzerland',
},
{
'url': 'http://www.rts.ch/video/info/journal-continu/5745356-londres-cachee-par-un-epais-smog.html',
- 'md5': '9f713382f15322181bb366cc8c3a4ff0',
+ 'md5': '1bae984fe7b1f78e94abc74e802ed99f',
'info_dict': {
'id': '5745356',
'display_id': 'londres-cachee-par-un-epais-smog',
'duration': 33,
'title': 'Londres cachée par un épais smog',
'description': 'Un important voile de smog recouvre Londres depuis mercredi, provoqué par la pollution et du sable du Sahara.',
- 'uploader': 'Le Journal en continu',
+ 'uploader': 'L\'actu en vidéo',
'upload_date': '20140403',
'timestamp': 1396537322,
- 'thumbnail': 're:^https?://.*\.image',
+ 'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
- 'params': {
- # m3u8 download
- 'skip_download': True,
- }
},
{
'url': 'http://www.rts.ch/audio/couleur3/programmes/la-belle-video-de-stephane-laurenceau/5706148-urban-hippie-de-damien-krisl-03-04-2014.html',
'title': 'Hockey: Davos décroche son 31e titre de champion de Suisse',
},
'playlist_mincount': 5,
+ },
+ {
+ 'url': 'http://pages.rts.ch/emissions/passe-moi-les-jumelles/5624065-entre-ciel-et-mer.html',
+ 'only_matching': True,
}
]
# media_id extracted out of URL is not always a real id
if 'video' not in all_info and 'audio' not in all_info:
- page = self._download_webpage(url, display_id)
+ entries = []
- # article with videos on rhs
- videos = re.findall(
- r'<article[^>]+class="content-item"[^>]*>\s*<a[^>]+data-video-urn="urn:([^"]+)"',
- page)
- if not videos:
+ for item in all_info.get('items', []):
+ item_url = item.get('url')
+ if not item_url:
+ continue
+ entries.append(self.url_result(item_url, 'RTS'))
+
+ if not entries:
+ page, urlh = self._download_webpage_handle(url, display_id)
+ if re.match(self._VALID_URL, urlh.geturl()).group('id') != media_id:
+ return self.url_result(urlh.geturl(), 'RTS')
+
+ # article with videos on rhs
videos = re.findall(
- r'(?s)<iframe[^>]+class="srg-player"[^>]+src="[^"]+urn:([^"]+)"',
+ r'<article[^>]+class="content-item"[^>]*>\s*<a[^>]+data-video-urn="urn:([^"]+)"',
page)
- if videos:
- entries = [self.url_result('srgssr:%s' % video_urn, 'SRGSSR') for video_urn in videos]
- return self.playlist_result(entries, media_id, self._og_search_title(page))
+ if not videos:
+ videos = re.findall(
+ r'(?s)<iframe[^>]+class="srg-player"[^>]+src="[^"]+urn:([^"]+)"',
+ page)
+ if videos:
+ entries = [self.url_result('srgssr:%s' % video_urn, 'SRGSSR') for video_urn in videos]
+
+ if entries:
+ return self.playlist_result(entries, media_id, all_info.get('title'))
internal_id = self._html_search_regex(
r'<(?:video|audio) data-id="([0-9]+)"', page,
info = all_info['video']['JSONinfo'] if 'video' in all_info else all_info['audio']
- upload_timestamp = parse_iso8601(info.get('broadcast_date'))
- duration = info.get('duration') or info.get('cutout') or info.get('cutduration')
- if isinstance(duration, compat_str):
- duration = parse_duration(duration)
- view_count = info.get('plays')
- thumbnail = unescapeHTML(info.get('preview_image_url'))
+ title = info['title']
def extract_bitrate(url):
return int_or_none(self._search_regex(
r'-([0-9]+)k\.', url, 'bitrate', default=None))
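The `extract_bitrate` helper above pulls the bitrate out of stream URLs that encode it as `-<kbps>k.` right before the extension. A standalone sketch of the same regex, outside the extractor (the sample URLs are hypothetical):

```python
import re

def extract_bitrate(url):
    # Bitrate is encoded as "-<kbps>k." immediately before the file extension.
    m = re.search(r'-([0-9]+)k\.', url)
    return int(m.group(1)) if m else None

assert extract_bitrate('http://example.com/video-701k.mp4') == 701
assert extract_bitrate('http://example.com/video.mp4') is None
```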
formats = []
- for format_id, format_url in info['streams'].items():
- if format_id == 'hds_sd' and 'hds' in info['streams']:
+ streams = info.get('streams', {})
+ for format_id, format_url in streams.items():
+ if format_id == 'hds_sd' and 'hds' in streams:
continue
- if format_id == 'hls_sd' and 'hls' in info['streams']:
+ if format_id == 'hls_sd' and 'hls' in streams:
continue
- if format_url.endswith('.f4m'):
- token = self._download_xml(
- 'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
- media_id, 'Downloading %s token' % format_id)
- auth_params = xpath_text(token, './/authparams', 'auth params')
- if not auth_params:
- continue
- formats.extend(self._extract_f4m_formats(
- '%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
- media_id, f4m_id=format_id, fatal=False))
- elif format_url.endswith('.m3u8'):
- formats.extend(self._extract_m3u8_formats(
- format_url, media_id, 'mp4', 'm3u8_native', m3u8_id=format_id, fatal=False))
+ ext = determine_ext(format_url)
+ if ext in ('m3u8', 'f4m'):
+ format_url = self._get_tokenized_src(format_url, media_id, format_id)
+ if ext == 'f4m':
+ formats.extend(self._extract_f4m_formats(
+ format_url + ('?' if '?' not in format_url else '&') + 'hdcore=3.4.0',
+ media_id, f4m_id=format_id, fatal=False))
+ else:
+ formats.extend(self._extract_m3u8_formats(
+ format_url, media_id, 'mp4', 'm3u8_native', m3u8_id=format_id, fatal=False))
else:
formats.append({
'format_id': format_id,
'tbr': extract_bitrate(format_url),
})
- if 'media' in info:
- formats.extend([{
- 'format_id': '%s-%sk' % (media['ext'], media['rate']),
- 'url': 'http://download-video.rts.ch/%s' % media['url'],
- 'tbr': media['rate'] or extract_bitrate(media['url']),
- } for media in info['media'] if media.get('rate')])
+ for media in info.get('media', []):
+ media_url = media.get('url')
+ if not media_url or re.match(r'https?://', media_url):
+ continue
+ rate = media.get('rate')
+ ext = media.get('ext') or determine_ext(media_url, 'mp4')
+ format_id = ext
+ if rate:
+ format_id += '-%dk' % rate
+ formats.append({
+ 'format_id': format_id,
+ 'url': 'http://download-video.rts.ch/' + media_url,
+ 'tbr': rate or extract_bitrate(media_url),
+ })
self._check_formats(formats, media_id)
self._sort_formats(formats)
+ duration = info.get('duration') or info.get('cutout') or info.get('cutduration')
+ if isinstance(duration, compat_str):
+ duration = parse_duration(duration)
+
return {
'id': media_id,
'display_id': display_id,
'formats': formats,
- 'title': info['title'],
+ 'title': title,
'description': info.get('intro'),
'duration': duration,
- 'view_count': view_count,
+ 'view_count': int_or_none(info.get('plays')),
'uploader': info.get('programName'),
- 'timestamp': upload_timestamp,
- 'thumbnail': thumbnail,
+ 'timestamp': parse_iso8601(info.get('broadcast_date')),
+ 'thumbnail': unescapeHTML(info.get('preview_image_url')),
}
title += ' ' + time.strftime('%Y-%m-%dZ%H%M%S', start_time)
vidplayer_id = self._search_regex(
- r'playerId=player([0-9]+)', webpage, 'internal video ID')
+ (r'playerId=player([0-9]+)',
+ r'class=["\'].*?\blive_mod\b.*?["\'][^>]+data-assetid=["\'](\d+)',
+ r'data-id=["\'](\d+)'),
+ webpage, 'internal video ID')
png_url = 'http://www.rtve.es/ztnr/movil/thumbnail/amonet/videos/%s.png' % vidplayer_id
png = self._download_webpage(png_url, video_id, 'Downloading url information')
m3u8_url = _decrypt_url(png)
'id': '131946',
'ext': 'mp4',
'title': 'Grote zoektocht in zee bij Zandvoort naar vermiste vrouw',
- 'thumbnail': 're:^https?:.*\.jpg$'
+ 'thumbnail': r're:^https?:.*\.jpg$'
}
}
@classmethod
def _extract_url(self, webpage):
mobj = re.search(
- '<iframe[^>]+src=(?P<q1>[\'"])(?P<url>(?:https?:)?//rudo\.video/vod/[0-9a-zA-Z]+)(?P=q1)',
+ r'<iframe[^>]+src=(?P<q1>[\'"])(?P<url>(?:https?:)?//rudo\.video/vod/[0-9a-zA-Z]+)(?P=q1)',
webpage)
if mobj:
return mobj.group('url')
'ext': 'divx',
'title': 'КОТ бааааам',
'description': 'классный кот)',
- 'thumbnail': 're:^http://.*\.jpg$',
+ 'thumbnail': r're:^http://.*\.jpg$',
}
}
'ext': 'mp4',
'title': 'Oletko aina halunnut tietää mitä tapahtuu vain hetki ennen lähetystä? - Nyt se selvisi!',
'description': 'md5:cfc6ccf0e57a814360df464a91ff67d6',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 114,
'age_limit': 0,
},
'ext': 'mp4',
'title': 'Superpesis: katso koko kausi Ruudussa',
'description': 'md5:bfb7336df2a12dc21d18fa696c9f8f23',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 40,
'age_limit': 0,
},
'ext': 'mp4',
'title': 'Osa 1: Mikael Jungner',
'description': 'md5:7d90f358c47542e3072ff65d7b1bcffe',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
},
},
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
video_url, video_id, f4m_id='hds', fatal=False))
+ elif ext == 'mpd':
+ formats.extend(self._extract_mpd_formats(
+ video_url, video_id, mpd_id='dash', fatal=False))
else:
proto = compat_urllib_parse_urlparse(video_url).scheme
if not child.tag.startswith('HTTP') and proto != 'rtmp':
'upload_date': '20120816',
'uploader': 'Howcast',
'uploader_id': 'Howcast',
- 'description': 're:(?s).* Hi, my name is Rene Dreifuss\. And I\'m here to show you some MMA.*',
+ 'description': r're:(?s).* Hi, my name is Rene Dreifuss\. And I\'m here to show you some MMA.*',
},
'params': {
'skip_download': True
'ext': 'mp4',
'title': 'Dingo Conservation (The Feed)',
'description': 'md5:f250a9856fca50d22dec0b5b8015f8a5',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'duration': 308,
'timestamp': 1408613220,
'upload_date': '20140821',
'ext': 'm4v',
'title': 'Color Measurement with Ocean Optics Spectrometers',
'description': 'md5:240369cde69d8bed61349a199c5fb153',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
}
}, {
'url': 'http://www.screencast.com/t/V2uXehPJa1ZI',
'ext': 'mov',
'title': 'The Amadeus Spectrometer',
'description': 're:^In this video, our friends at.*To learn more about Amadeus, visit',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
}
}, {
'url': 'http://www.screencast.com/t/aAB3iowa',
'ext': 'mp4',
'title': 'Google Earth Export',
'description': 'Provides a demo of a CommunityViz export to Google Earth, one of the 3D viewing options.',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
}
}, {
'url': 'http://www.screencast.com/t/X3ddTrYh',
'ext': 'wmv',
'title': 'Toolkit 6 User Group Webinar (2014-03-04) - Default Judgment and First Impression',
'description': 'md5:7b9f393bc92af02326a5c5889639eab0',
- 'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
}
}, {
'url': 'http://screencast.com/t/aAB3iowa',
'id': 'c2lD3BeOPl',
'ext': 'mp4',
'title': 'Welcome to 3-4 Philosophy @ DECV!',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'as the title says! also: some general info re 1) VCE philosophy and 2) distance learning.',
'duration': 369.163,
}
+++ /dev/null
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import compat_str
-from ..utils import (
- int_or_none,
- parse_age_limit,
-)
-
-
-class ScreenJunkiesIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?screenjunkies\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
- _TESTS = [{
- 'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
- 'md5': '5c2b686bec3d43de42bde9ec047536b0',
- 'info_dict': {
- 'id': '2841915',
- 'display_id': 'best-quentin-tarantino-movie',
- 'ext': 'mp4',
- 'title': 'Best Quentin Tarantino Movie',
- 'thumbnail': 're:^https?://.*\.jpg',
- 'duration': 3671,
- 'age_limit': 13,
- 'tags': list,
- },
- }, {
- 'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
- 'info_dict': {
- 'id': '2348808',
- 'display_id': 'honest-trailers-the-dark-knight',
- 'ext': 'mp4',
- 'title': "Honest Trailers: 'The Dark Knight'",
- 'thumbnail': 're:^https?://.*\.jpg',
- 'age_limit': 10,
- 'tags': list,
- },
- }, {
- # requires subscription but worked around
- 'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
- 'info_dict': {
- 'id': '3003285',
- 'display_id': 'knocking-dead-ep-1-the-show-so-far',
- 'ext': 'mp4',
- 'title': 'Knocking Dead Ep 1: State of The Dead Recap',
- 'thumbnail': 're:^https?://.*\.jpg',
- 'duration': 3307,
- 'age_limit': 13,
- 'tags': list,
- },
- }]
-
- _DEFAULT_BITRATES = (48, 150, 496, 864, 2240)
-
- def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
- display_id = mobj.group('display_id')
-
- if not video_id:
- webpage = self._download_webpage(url, display_id)
- video_id = self._search_regex(
- (r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
- webpage, 'video id')
-
- webpage = self._download_webpage(
- 'http://www.screenjunkies.com/embed/%s' % video_id,
- display_id, 'Downloading video embed page')
- embed_vars = self._parse_json(
- self._search_regex(
- r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
- display_id)
-
- title = embed_vars['contentName']
-
- formats = []
- bitrates = []
- for f in embed_vars.get('media', []):
- if not f.get('uri') or f.get('mediaPurpose') != 'play':
- continue
- bitrate = int_or_none(f.get('bitRate'))
- if bitrate:
- bitrates.append(bitrate)
- formats.append({
- 'url': f['uri'],
- 'format_id': 'http-%d' % bitrate if bitrate else 'http',
- 'width': int_or_none(f.get('width')),
- 'height': int_or_none(f.get('height')),
- 'tbr': bitrate,
- 'format': 'mp4',
- })
-
- if not bitrates:
- # When subscriptionLevel > 0, i.e. plus subscription is required
- # media list will be empty. However, hds and hls uris are still
- # available. We can grab them assuming bitrates to be default.
- bitrates = self._DEFAULT_BITRATES
-
- auth_token = embed_vars.get('AuthToken')
-
- def construct_manifest_url(base_url, ext):
- pieces = [base_url]
- pieces.extend([compat_str(b) for b in bitrates])
- pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
- return ','.join(pieces)
-
- if bitrates and auth_token:
- hds_url = embed_vars.get('hdsUri')
- if hds_url:
- f4m_formats = self._extract_f4m_formats(
- construct_manifest_url(hds_url, 'f4m'),
- display_id, f4m_id='hds', fatal=False)
- if len(f4m_formats) == len(bitrates):
- for f, bitrate in zip(f4m_formats, bitrates):
- if not f.get('tbr'):
- f['format_id'] = 'hds-%d' % bitrate
- f['tbr'] = bitrate
- # TODO: fix f4m downloader to handle manifests without bitrates if possible
- # formats.extend(f4m_formats)
-
- hls_url = embed_vars.get('hlsUri')
- if hls_url:
- formats.extend(self._extract_m3u8_formats(
- construct_manifest_url(hls_url, 'm3u8'),
- display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
- self._sort_formats(formats)
-
- return {
- 'id': video_id,
- 'display_id': display_id,
- 'title': title,
- 'thumbnail': embed_vars.get('thumbUri'),
- 'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
- 'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
- 'tags': embed_vars.get('tags', '').split(','),
- 'formats': formats,
- }
'id': 'judiciary031715',
'ext': 'mp4',
'title': 'Integrated Senate Video Player',
- 'thumbnail': 're:^https?://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
},
'params': {
# m3u8 download
float_or_none,
parse_iso8601,
update_url_query,
+ int_or_none,
+ determine_protocol,
+ unescapeHTML,
)
'info_dict': {
'id': 'GxfCe0Zo7D-175909-5588'
},
- 'playlist_count': 9,
+ 'playlist_count': 8,
# test the first video only to prevent lengthy tests
'playlist': [{
'info_dict': {
- 'id': '198180',
+ 'id': '240385',
'ext': 'mp4',
- 'title': 'Recap: CLE 5, LAA 4',
- 'description': '8/14/16: Naquin, Almonte lead Indians in 5-4 win',
- 'duration': 57.343,
- 'thumbnail': 're:https?://.*\.jpg$',
- 'upload_date': '20160815',
- 'timestamp': 1471221961,
+ 'title': 'Indians introduce Encarnacion',
+ 'description': 'Indians president of baseball operations Chris Antonetti and Edwin Encarnacion discuss the slugger\'s three-year contract with Cleveland',
+ 'duration': 137.898,
+ 'thumbnail': r're:https?://.*\.jpg$',
+ 'upload_date': '20170105',
+ 'timestamp': 1483649762,
},
}],
'params': {
for video in playlist_data['playlistData'][0]:
info_dict = self._parse_jwplayer_data(
video['jwconfiguration'],
- require_title=False, rtmp_params={'no_resume': True})
+ require_title=False, m3u8_id='hls', rtmp_params={'no_resume': True})
+
+ for f in info_dict['formats']:
+ if f.get('tbr'):
+ continue
+ tbr = int_or_none(self._search_regex(
+ r'/(\d+)k/', f['url'], 'bitrate', default=None))
+ if not tbr:
+ continue
+ f.update({
+ 'format_id': '%s-%d' % (determine_protocol(f), tbr),
+ 'tbr': tbr,
+ })
+ self._sort_formats(info_dict['formats'], ('tbr', 'height', 'width', 'format_id'))
thumbnails = []
if video.get('thumbnailUrl'):
'url': video['smThumbnailUrl'],
})
info_dict.update({
- 'title': video['S_headLine'],
- 'description': video.get('S_fullStory'),
+ 'title': video['S_headLine'].strip(),
+ 'description': unescapeHTML(video.get('S_fullStory')),
'thumbnails': thumbnails,
'duration': float_or_none(video.get('SM_length')),
'timestamp': parse_iso8601(video.get('S_sysDate'), delimiter=' '),
'title': 'md5:4d05a19a5fc049a63dbbaf05fb71d91b',
'description': 'md5:2b75327061310a3afb3fbd7d09e2e403',
'categories': list, # NSFW
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
}
}
+++ /dev/null
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
- parse_duration,
- sanitized_Request,
- urlencode_postdata,
-)
-
-
-class ShareSixIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?sharesix\.com/(?:f/)?(?P<id>[0-9a-zA-Z]+)'
- _TESTS = [
- {
- 'url': 'http://sharesix.com/f/OXjQ7Y6',
- 'md5': '9e8e95d8823942815a7d7c773110cc93',
- 'info_dict': {
- 'id': 'OXjQ7Y6',
- 'ext': 'mp4',
- 'title': 'big_buck_bunny_480p_surround-fix.avi',
- 'duration': 596,
- 'width': 854,
- 'height': 480,
- },
- },
- {
- 'url': 'http://sharesix.com/lfrwoxp35zdd',
- 'md5': 'dd19f1435b7cec2d7912c64beeee8185',
- 'info_dict': {
- 'id': 'lfrwoxp35zdd',
- 'ext': 'flv',
- 'title': 'WhiteBoard___a_Mac_vs_PC_Parody_Cartoon.mp4.flv',
- 'duration': 65,
- 'width': 1280,
- 'height': 720,
- },
- }
- ]
-
- def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
-
- fields = {
- 'method_free': 'Free'
- }
- post = urlencode_postdata(fields)
- req = sanitized_Request(url, post)
- req.add_header('Content-type', 'application/x-www-form-urlencoded')
-
- webpage = self._download_webpage(req, video_id,
- 'Downloading video page')
-
- video_url = self._search_regex(
- r"var\slnk1\s=\s'([^']+)'", webpage, 'video URL')
- title = self._html_search_regex(
- r'(?s)<dt>Filename:</dt>.+?<dd>(.+?)</dd>', webpage, 'title')
- duration = parse_duration(
- self._search_regex(
- r'(?s)<dt>Length:</dt>.+?<dd>(.+?)</dd>',
- webpage,
- 'duration',
- fatal=False
- )
- )
-
- m = re.search(
- r'''(?xs)<dt>Width\sx\sHeight</dt>.+?
- <dd>(?P<width>\d+)\sx\s(?P<height>\d+)</dd>''',
- webpage
- )
- width = height = None
- if m:
- width, height = int(m.group('width')), int(m.group('height'))
-
- formats = [{
- 'format_id': 'sd',
- 'url': video_url,
- 'width': width,
- 'height': height,
- }]
-
- return {
- 'id': video_id,
- 'title': title,
- 'duration': duration,
- 'formats': formats,
- }
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+ ExtractorError,
+ int_or_none,
+ urljoin,
+)
+
+
+class ShowRoomLiveIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?showroom-live\.com/(?!onlive|timetable|event|campaign|news|ranking|room)(?P<id>[^/?#&]+)'
+ _TEST = {
+ 'url': 'https://www.showroom-live.com/48_Nana_Okada',
+ 'only_matching': True,
+ }
+
+ def _real_extract(self, url):
+ broadcaster_id = self._match_id(url)
+
+ webpage = self._download_webpage(url, broadcaster_id)
+
+ room_id = self._search_regex(
+ (r'SrGlobal\.roomId\s*=\s*(\d+)',
+ r'(?:profile|room)\?room_id\=(\d+)'), webpage, 'room_id')
+
+ room = self._download_json(
+ urljoin(url, '/api/room/profile?room_id=%s' % room_id),
+ broadcaster_id)
+
+ is_live = room.get('is_onlive')
+ if is_live is not True:
+ raise ExtractorError('%s is offline' % broadcaster_id, expected=True)
+
+ uploader = room.get('performer_name') or broadcaster_id
+ title = room.get('room_name') or room.get('main_name') or uploader
+
+ streaming_url_list = self._download_json(
+ urljoin(url, '/api/live/streaming_url?room_id=%s' % room_id),
+ broadcaster_id)['streaming_url_list']
+
+ formats = []
+ for stream in streaming_url_list:
+ stream_url = stream.get('url')
+ if not stream_url:
+ continue
+ stream_type = stream.get('type')
+ if stream_type == 'hls':
+ m3u8_formats = self._extract_m3u8_formats(
+ stream_url, broadcaster_id, ext='mp4', m3u8_id='hls',
+ live=True)
+ for f in m3u8_formats:
+ f['quality'] = int_or_none(stream.get('quality', 100))
+ formats.extend(m3u8_formats)
+ elif stream_type == 'rtmp':
+ stream_name = stream.get('stream_name')
+ if not stream_name:
+ continue
+ formats.append({
+ 'url': stream_url,
+ 'play_path': stream_name,
+ 'page_url': url,
+ 'player_url': 'https://www.showroom-live.com/assets/swf/v3/ShowRoomLive.swf',
+ 'rtmp_live': True,
+ 'ext': 'flv',
+ 'format_id': 'rtmp',
+ 'format_note': stream.get('label'),
+ 'quality': int_or_none(stream.get('quality', 100)),
+ })
+ self._sort_formats(formats)
+
+ return {
+ 'id': compat_str(room.get('live_id') or broadcaster_id),
+ 'title': self._live_title(title),
+ 'description': room.get('description'),
+ 'timestamp': int_or_none(room.get('current_live_started_at')),
+ 'uploader': uploader,
+ 'uploader_id': broadcaster_id,
+ 'view_count': int_or_none(room.get('view_num')),
+ 'formats': formats,
+ 'is_live': True,
+ }
from __future__ import unicode_literals
from .common import InfoExtractor
+from ..utils import strip_or_none
class SkySportsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?skysports\.com/watch/video/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.skysports.com/watch/video/10328419/bale-its-our-time-to-shine',
- 'md5': 'c44a1db29f27daf9a0003e010af82100',
+ 'md5': '77d59166cddc8d3cb7b13e35eaf0f5ec',
'info_dict': {
'id': '10328419',
- 'ext': 'flv',
- 'title': 'Bale: Its our time to shine',
- 'description': 'md5:9fd1de3614d525f5addda32ac3c482c9',
+ 'ext': 'mp4',
+ 'title': 'Bale: It\'s our time to shine',
+ 'description': 'md5:e88bda94ae15f7720c5cb467e777bb6d',
},
'add_ie': ['Ooyala'],
}
'url': 'ooyala:%s' % self._search_regex(
r'data-video-id="([^"]+)"', webpage, 'ooyala id'),
'title': self._og_search_title(webpage),
- 'description': self._og_search_description(webpage),
+ 'description': strip_or_none(self._og_search_description(webpage)),
'ie_key': 'Ooyala',
}
'ext': 'mp4',
'title': 'virginie baisee en cam',
'age_limit': 18,
- 'thumbnail': 're:https?://.*?\.jpg'
+ 'thumbnail': r're:https?://.*?\.jpg'
}
}
'uploader': 'psavari1',
'uploader_id': 'psavari1',
'upload_date': '20081103',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'videopassword': '223322',
'uploader': 'вАся',
'uploader_id': 'asya_prosto',
'upload_date': '20081218',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
},
'params': {
'duration': 248,
'filesize_approx': 40700000,
'description': 'A drone flying through Fourth of July Fireworks',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'expected_warnings': ['description'],
}, {
'duration': 126,
'filesize_approx': 8500000,
'description': 'The top 10 George W. Bush moments, brought to you by David Letterman!',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}]
})
# We have to retrieve the url
- streams_url = ('http://api.soundcloud.com/i1/tracks/{0}/streams?'
- 'client_id={1}&secret_token={2}'.format(track_id, self._IPHONE_CLIENT_ID, secret_token))
format_dict = self._download_json(
- streams_url,
- track_id, 'Downloading track url')
+ 'http://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
+ track_id, 'Downloading track url', query={
+ 'client_id': self._CLIENT_ID,
+ 'secret_token': secret_token,
+ })
for key, stream_url in format_dict.items():
+ abr = int_or_none(self._search_regex(
+ r'_(\d+)_url', key, 'audio bitrate', default=None))
if key.startswith('http'):
- formats.append({
+ stream_formats = [{
'format_id': key,
'ext': ext,
'url': stream_url,
- 'vcodec': 'none',
- })
+ }]
elif key.startswith('rtmp'):
# The url doesn't have an rtmp app, we have to extract the playpath
url, path = stream_url.split('mp3:', 1)
- formats.append({
+ stream_formats = [{
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': 'flv',
- 'vcodec': 'none',
- })
-
- if not formats:
- # We fallback to the stream_url in the original info, this
- # cannot be always used, sometimes it can give an HTTP 404 error
- formats.append({
- 'format_id': 'fallback',
- 'url': info['stream_url'] + '?client_id=' + self._CLIENT_ID,
- 'ext': ext,
- 'vcodec': 'none',
- })
-
- for f in formats:
- if f['format_id'].startswith('http'):
- f['protocol'] = 'http'
- if f['format_id'].startswith('rtmp'):
- f['protocol'] = 'rtmp'
+ }]
+ elif key.startswith('hls'):
+ stream_formats = self._extract_m3u8_formats(
+ stream_url, track_id, 'mp3', entry_protocol='m3u8_native',
+ m3u8_id=key, fatal=False)
+ else:
+ continue
+
+ for f in stream_formats:
+ f['abr'] = abr
+
+ formats.extend(stream_formats)
+
+ if not formats:
+ # We fall back to the stream_url from the original info; it cannot
+ # always be used, as it sometimes returns an HTTP 404 error
+ formats.append({
+ 'format_id': 'fallback',
+ 'url': info['stream_url'] + '?client_id=' + self._CLIENT_ID,
+ 'ext': ext,
+ })
+
+ for f in formats:
+ f['vcodec'] = 'none'
self._check_formats(formats, track_id)
self._sort_formats(formats)
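The loop above derives the audio bitrate from the stream-dictionary key names (e.g. `http_mp3_128_url`). A self-contained sketch of that key parsing; the key names are illustrative of the API's naming scheme, not an exhaustive list:

```python
import re

def key_abr(key):
    # Keys look like "<protocol>_<codec>_<kbps>_url"; pull out the kbps part.
    m = re.search(r'_(\d+)_url', key)
    return int(m.group(1)) if m else None

assert key_abr('http_mp3_128_url') == 128
assert key_abr('hls_opus_64_url') == 64
assert key_abr('rtmp_url') is None  # no bitrate segment present
```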
webpage = self._download_webpage(url, display_id)
audio_url = self._html_search_regex(
r'(?s)m4a\:\s"([^"]+)"', webpage, 'audio URL')
- audio_id = re.split('\/|\.', audio_url)[-2]
+ audio_id = re.split(r'\/|\.', audio_url)[-2]
description = self._html_search_regex(
r'(?s)<li>Description:\s(.*?)<\/li>', webpage, 'description',
fatal=False)
class SouthParkIE(MTVServicesInfoExtractor):
IE_NAME = 'southpark.cc.com'
- _VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.cc\.com/(?:clips|full-episodes)/(?P<id>.+?)(\?|#|$))'
+ _VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.cc\.com/(?:clips|(?:full-)?episodes)/(?P<id>.+?)(\?|#|$))'
_FEED_URL = 'http://www.southparkstudios.com/feeds/video-player/mrss'
class SouthParkNlIE(SouthParkIE):
IE_NAME = 'southpark.nl'
- _VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.nl/(?:clips|full-episodes)/(?P<id>.+?)(\?|#|$))'
+ _VALID_URL = r'https?://(?:www\.)?(?P<url>southpark\.nl/(?:clips|(?:full-)?episodes)/(?P<id>.+?)(\?|#|$))'
_FEED_URL = 'http://www.southpark.nl/feeds/video-player/mrss/'
_TESTS = [{
'ext': 'mp4',
'title': 'fantasy solo',
'description': 'Watch fantasy solo free HD porn video - 05 minutes - dillion harper masturbates on a bed free adult movies.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'silly2587',
'age_limit': 18,
}
r'playerData\.cdnPath([0-9]{3,})\s*=\s*(?:encodeURIComponent\()?["\']([^"\']+)["\']', webpage)
heights = [int(video[0]) for video in videos]
video_urls = list(map(compat_urllib_parse_unquote, [video[1] for video in videos]))
- if webpage.find('flashvars\.encrypted = "true"') != -1:
+ if re.search(r'flashvars\.encrypted\s*=\s*"true"', webpage):
password = self._search_regex(
r'flashvars\.video_title = "([^"]+)',
webpage, 'password').replace('+', ' ')
'ext': 'm4v',
'title': 'Flug MH370',
'description': 'Das Rätsel um die Boeing 777 der Malaysia-Airlines',
- 'thumbnail': 're:http://.*\.jpg$',
+ 'thumbnail': r're:http://.*\.jpg$',
},
'params': {
# m3u8 download
_CUSTOM_URL_REGEX = re.compile(r'spikenetworkapp://([^/]+/[-a-fA-F0-9]+)')
def _extract_mgid(self, webpage):
- mgid = super(SpikeIE, self)._extract_mgid(webpage, default=None)
+ mgid = super(SpikeIE, self)._extract_mgid(webpage)
if mgid is None:
url_parts = self._search_regex(self._CUSTOM_URL_REGEX, webpage, 'episode_id')
video_type, episode_id = url_parts.split('/', 1)
webpage = self._download_webpage(url, media_id)
- video_id = self._html_search_regex('clipId=([\w-]+)', webpage, 'video id')
+ video_id = self._html_search_regex(r'clipId=([\w-]+)', webpage, 'video id')
metadata = self._download_xml(
'http://sport5-metadata-rr-d.nsacdn.com/vod/vod/%s/HDS/metadata.xml' % video_id,
import re
from .common import InfoExtractor
-from ..compat import compat_urlparse
-from ..utils import (
- js_to_json,
- unified_strdate,
-)
-
-
-class SportBoxIE(InfoExtractor):
- _VALID_URL = r'https?://news\.sportbox\.ru/(?:[^/]+/)+spbvideo_NI\d+_(?P<display_id>.+)'
- _TESTS = [{
- 'url': 'http://news.sportbox.ru/Vidy_sporta/Avtosport/Rossijskij/spbvideo_NI483529_Gonka-2-zaezd-Obyedinenniy-2000-klassi-Turing-i-S',
- 'md5': 'ff56a598c2cf411a9a38a69709e97079',
- 'info_dict': {
- 'id': '80822',
- 'ext': 'mp4',
- 'title': 'Гонка 2 заезд ««Объединенный 2000»: классы Туринг и Супер-продакшн',
- 'description': 'md5:3d72dc4a006ab6805d82f037fdc637ad',
- 'thumbnail': 're:^https?://.*\.jpg$',
- 'upload_date': '20140928',
- },
- 'params': {
- # m3u8 download
- 'skip_download': True,
- },
- }, {
- 'url': 'http://news.sportbox.ru/Vidy_sporta/billiard/spbvideo_NI486287_CHempionat-mira-po-dinamichnoy-piramide-4',
- 'only_matching': True,
- }, {
- 'url': 'http://news.sportbox.ru/video/no_ads/spbvideo_NI536574_V_Novorossijske_proshel_detskij_turnir_Pole_slavy_bojevoj?ci=211355',
- 'only_matching': True,
- }]
-
- def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- display_id = mobj.group('display_id')
-
- webpage = self._download_webpage(url, display_id)
-
- player = self._search_regex(
- r'src="/?(vdl/player/[^"]+)"', webpage, 'player')
-
- title = self._html_search_regex(
- [r'"nodetitle"\s*:\s*"([^"]+)"', r'class="node-header_{1,2}title">([^<]+)'],
- webpage, 'title')
- description = self._og_search_description(webpage) or self._html_search_meta(
- 'description', webpage, 'description')
- thumbnail = self._og_search_thumbnail(webpage)
- upload_date = unified_strdate(self._html_search_meta(
- 'dateCreated', webpage, 'upload date'))
-
- return {
- '_type': 'url_transparent',
- 'url': compat_urlparse.urljoin(url, '/%s' % player),
- 'display_id': display_id,
- 'title': title,
- 'description': description,
- 'thumbnail': thumbnail,
- 'upload_date': upload_date,
- }
+from ..utils import js_to_json
class SportBoxEmbedIE(InfoExtractor):
'id': '211355',
'ext': 'mp4',
'title': 'В Новороссийске прошел детский турнир «Поле славы боевой»',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
'title': 're:Li-Ning Badminton Weltmeisterschaft 2014 Kopenhagen',
'categories': ['Badminton'],
'view_count': int,
- 'thumbnail': 're:^https?://.*\.jpg$',
- 'description': 're:Die Badminton-WM 2014 aus Kopenhagen bei Sportdeutschland\.TV',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'description': r're:Die Badminton-WM 2014 aus Kopenhagen bei Sportdeutschland\.TV',
'timestamp': int,
'upload_date': 're:^201408[23][0-9]$',
},
'timestamp': 1408976060,
'duration': 2732,
'title': 'Li-Ning Badminton Weltmeisterschaft 2014 Kopenhagen: Herren Einzel, Wei Lee vs. Keun Lee',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'view_count': int,
'categories': ['Li-Ning Badminton WM 2014'],
import re
from .common import InfoExtractor
+from ..compat import compat_urllib_parse_urlparse
from ..utils import (
ExtractorError,
parse_iso8601,
'STARTDATE': 'This video is not yet available. Please try again later.',
}
+ def _get_tokenized_src(self, url, video_id, format_id):
+ sp = compat_urllib_parse_urlparse(url).path.split('/')
+ token = self._download_json(
+ 'http://tp.srgssr.ch/akahd/token?acl=/%s/%s/*' % (sp[1], sp[2]),
+ video_id, 'Downloading %s token' % format_id, fatal=False) or {}
+ auth_params = token.get('token', {}).get('authparams')
+ if auth_params:
+ url += '?' + auth_params
+ return url
+
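`_get_tokenized_src` builds the Akamai token ACL from the first two path components of the stream URL before querying `tp.srgssr.ch`. A minimal sketch of that derivation, using a hypothetical stream URL:

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

# Hypothetical HLS stream URL; only the two leading path components matter.
url = 'http://srfvodhd-vh.akamaihd.net/i/vod/tagesschau/master.m3u8'
sp = urlparse(url).path.split('/')
acl = '/%s/%s/*' % (sp[1], sp[2])
assert acl == '/i/vod/*'  # wildcard ACL passed as ?acl=... to the token service
```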
def get_media_data(self, bu, media_type, media_id):
media_data = self._download_json(
'http://il.srgssr.ch/integrationlayer/1.0/ue/%s/%s/play/%s.json' % (bu, media_type, media_id),
def _real_extract(self, url):
bu, media_type, media_id = re.match(self._VALID_URL, url).groups()
- if bu == 'rts':
- return self.url_result('rts:%s' % media_id, 'RTS')
-
media_data = self.get_media_data(bu, media_type, media_id)
metadata = media_data['AssetMetadatas']['AssetMetadata'][0]
asset_url = asset['text']
quality = asset['@quality']
format_id = '%s-%s' % (protocol, quality)
- if protocol == 'HTTP-HDS':
- formats.extend(self._extract_f4m_formats(
- asset_url + '?hdcore=3.4.0', media_id,
- f4m_id=format_id, fatal=False))
- elif protocol == 'HTTP-HLS':
- formats.extend(self._extract_m3u8_formats(
- asset_url, media_id, 'mp4', 'm3u8_native',
- m3u8_id=format_id, fatal=False))
+ if protocol.startswith('HTTP-HDS') or protocol.startswith('HTTP-HLS'):
+ asset_url = self._get_tokenized_src(asset_url, media_id, format_id)
+ if protocol.startswith('HTTP-HDS'):
+ formats.extend(self._extract_f4m_formats(
+ asset_url + ('?' if '?' not in asset_url else '&') + 'hdcore=3.4.0',
+ media_id, f4m_id=format_id, fatal=False))
+ elif protocol.startswith('HTTP-HLS'):
+ formats.extend(self._extract_m3u8_formats(
+ asset_url, media_id, 'mp4', 'm3u8_native',
+ m3u8_id=format_id, fatal=False))
else:
formats.append({
'format_id': format_id,
_TESTS = [{
'url': 'http://www.srf.ch/play/tv/10vor10/video/snowden-beantragt-asyl-in-russland?id=28e1a57d-5b76-4399-8ab3-9097f071e6c5',
- 'md5': '4cd93523723beff51bb4bee974ee238d',
+ 'md5': 'da6b5b3ac9fa4761a942331cef20fcb3',
'info_dict': {
'id': '28e1a57d-5b76-4399-8ab3-9097f071e6c5',
- 'ext': 'm4v',
+ 'ext': 'mp4',
'upload_date': '20130701',
'title': 'Snowden beantragt Asyl in Russland',
'timestamp': 1372713995,
'uploader': '19h30',
'upload_date': '20141201',
'timestamp': 1417458600,
- 'thumbnail': 're:^https?://.*\.image',
+ 'thumbnail': r're:^https?://.*\.image',
'view_count': int,
},
'params': {
'ext': 'mp4',
'title': 'sportarena (26.10.2014)',
'description': 'Ringen: KSV Köllerbach gegen Aachen-Walheim; Frauen-Fußball: 1. FC Saarbrücken gegen Sindelfingen; Motorsport: Rallye in Losheim; dazu: Interview mit Timo Bernhard; Turnen: TG Saar; Reitsport: Deutscher Voltigier-Pokal; Badminton: Interview mit Michael Fuchs ',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'no longer available',
}, {
r'(?s)<description>([^<]+)</description>',
coursepage, 'description', fatal=False)
- links = orderedSet(re.findall('<a href="(VideoPage.php\?[^"]+)">', coursepage))
+ links = orderedSet(re.findall(r'<a href="(VideoPage.php\?[^"]+)">', coursepage))
info['entries'] = [self.url_result(
'http://openclassroom.stanford.edu/MainFolder/%s' % unescapeHTML(l)
) for l in links]
rootpage = self._download_webpage(rootURL, info['id'],
errnote='Unable to download course info page')
- links = orderedSet(re.findall('<a href="(CoursePage.php\?[^"]+)">', rootpage))
+ links = orderedSet(re.findall(r'<a href="(CoursePage.php\?[^"]+)">', rootpage))
info['entries'] = [self.url_result(
'http://openclassroom.stanford.edu/MainFolder/%s' % unescapeHTML(l)
) for l in links]
'title': 'Machine Learning Mastery and Cancer Clusters',
'description': 'md5:55163197a44e915a14a1ac3a1de0f2d3',
'duration': 1604,
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'http://www.stitcher.com/podcast/panoply/vulture-tv/e/the-rare-hourlong-comedy-plus-40846275?autoplay=true',
'title': "The CW's 'Crazy Ex-Girlfriend'",
'description': 'md5:04f1e2f98eb3f5cbb094cea0f9e19b17',
'duration': 2235,
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
'params': {
'skip_download': True,
'id': 'dnd1',
'ext': 'mp4',
'title': 'Mikel Oiarzabal scores to make it 0-3 for La Real against Espanyol',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'uploader': 'teabaker',
'timestamp': 1454964157.35115,
'upload_date': '20160208',
'id': 'moo',
'ext': 'mp4',
'title': '"Please don\'t eat me!"',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1426115495,
'upload_date': '20150311',
'duration': 12,
'ext': 'mp3',
'title': '輸',
'description': 'Crispy脆樂團 - 輸',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 260,
'upload_date': '20091018',
'uploader': 'Crispy脆樂團',
'ext': 'mp4',
'title': 'md5:0a400058e8105d39e35c35e7c5184164',
'description': 'md5:a31241990e1bd3a64e72ae99afb325fb',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 302,
'age_limit': 18,
}
'ext': 'mp4',
'title': 'Flygplan till Haile Selassie',
'duration': 3527,
- 'thumbnail': 're:^https?://.*[\.-]jpg$',
+ 'thumbnail': r're:^https?://.*[\.-]jpg$',
'age_limit': 0,
'subtitles': {
'sv': [{
# coding: utf-8
from __future__ import unicode_literals
-import re
-
from .common import InfoExtractor
-from ..utils import parse_duration
+from ..utils import (
+ parse_duration,
+ int_or_none,
+ determine_protocol,
+)
class SWRMediathekIE(InfoExtractor):
'ext': 'mp4',
'title': 'SWR odysso',
'description': 'md5:2012e31baad36162e97ce9eb3f157b8a',
- 'thumbnail': 're:^http:.*\.jpg$',
+ 'thumbnail': r're:^http:.*\.jpg$',
'duration': 2602,
'upload_date': '20140515',
'uploader': 'SWR Fernsehen',
'ext': 'mp4',
'title': 'Nachtcafé - Alltagsdroge Alkohol - zwischen Sektempfang und Komasaufen',
'description': 'md5:e0a3adc17e47db2c23aab9ebc36dbee2',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'duration': 5305,
'upload_date': '20140516',
'uploader': 'SWR Fernsehen',
'uploader_id': '990030',
},
+ 'skip': 'redirect to http://swrmediathek.de/index.htm?hinweis=swrlink',
}, {
'url': 'http://swrmediathek.de/player.htm?show=bba23e10-cb93-11e3-bf7f-0026b975f2e6',
'md5': '4382e4ef2c9d7ce6852535fa867a0dd3',
'ext': 'mp3',
'title': 'Saša Stanišic: Vor dem Fest',
'description': 'md5:5b792387dc3fbb171eb709060654e8c9',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'duration': 3366,
'upload_date': '20140520',
'uploader': 'SWR 2',
'uploader_id': '284670',
- }
+ },
+ 'skip': 'redirect to http://swrmediathek.de/index.htm?hinweis=swrlink',
}]
def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
+ video_id = self._match_id(url)
video = self._download_json(
- 'http://swrmediathek.de/AjaxEntry?ekey=%s' % video_id, video_id, 'Downloading video JSON')
+ 'http://swrmediathek.de/AjaxEntry?ekey=%s' % video_id,
+ video_id, 'Downloading video JSON')
attr = video['attr']
- media_type = attr['entry_etype']
+ title = attr['entry_title']
+ media_type = attr.get('entry_etype')
formats = []
- for entry in video['sub']:
- if entry['name'] != 'entry_media':
+ for entry in video.get('sub', []):
+ if entry.get('name') != 'entry_media':
continue
- entry_attr = entry['attr']
- codec = entry_attr['val0']
- quality = int(entry_attr['val1'])
-
- fmt = {
- 'url': entry_attr['val2'],
- 'quality': quality,
- }
-
- if media_type == 'Video':
- fmt.update({
- 'format_note': ['144p', '288p', '544p', '720p'][quality - 1],
- 'vcodec': codec,
- })
- elif media_type == 'Audio':
- fmt.update({
- 'acodec': codec,
+ entry_attr = entry.get('attr', {})
+ f_url = entry_attr.get('val2')
+ if not f_url:
+ continue
+ codec = entry_attr.get('val0')
+ if codec == 'm3u8':
+ formats.extend(self._extract_m3u8_formats(
+ f_url, video_id, 'mp4', 'm3u8_native',
+ m3u8_id='hls', fatal=False))
+ elif codec == 'f4m':
+ formats.extend(self._extract_f4m_formats(
+ f_url + '?hdcore=3.7.0', video_id,
+ f4m_id='hds', fatal=False))
+ else:
+ formats.append({
+ 'format_id': determine_protocol({'url': f_url}),
+ 'url': f_url,
+ 'quality': int_or_none(entry_attr.get('val1')),
+ 'vcodec': codec if media_type == 'Video' else 'none',
+ 'acodec': codec if media_type == 'Audio' else None,
})
- formats.append(fmt)
-
self._sort_formats(formats)
+ upload_date = None
+ entry_pdatet = attr.get('entry_pdatet')
+ if entry_pdatet:
+ upload_date = entry_pdatet[:-4]
+
return {
'id': video_id,
- 'title': attr['entry_title'],
- 'description': attr['entry_descl'],
- 'thumbnail': attr['entry_image_16_9'],
- 'duration': parse_duration(attr['entry_durat']),
- 'upload_date': attr['entry_pdatet'][:-4],
- 'uploader': attr['channel_title'],
- 'uploader_id': attr['channel_idkey'],
+ 'title': title,
+ 'description': attr.get('entry_descl'),
+ 'thumbnail': attr.get('entry_image_16_9'),
+ 'duration': parse_duration(attr.get('entry_durat')),
+ 'upload_date': upload_date,
+ 'uploader': attr.get('channel_title'),
+ 'uploader_id': attr.get('channel_idkey'),
'formats': formats,
}
'id': '179517',
'ext': 'mp4',
'title': 'Marie Kristin Boese, ARD Berlin, über den zukünftigen Kurs der AfD',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
'formats': 'mincount:6',
},
}, {
'id': '29417',
'ext': 'mp3',
'title': 'Trabi - Bye, bye Rennpappe',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
'formats': 'mincount:2',
},
}, {
'ext': 'mp4',
'title': 'Regierungsumbildung in Athen: Neue Minister in Griechenland vereidigt',
'description': '18.07.2015 20:10 Uhr',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
},
}, {
'url': 'http://www.tagesschau.de/multimedia/sendung/ts-5727.html',
'ext': 'mp4',
'title': 'Sendung: tagesschau \t04.12.2014 20:00 Uhr',
'description': 'md5:695c01bfd98b7e313c501386327aea59',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
},
}, {
# exclusive audio
'ext': 'mp3',
'title': 'Trabi - Bye, bye Rennpappe',
'description': 'md5:8687dda862cbbe2cfb2df09b56341317',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
},
}, {
# audio in article
'ext': 'mp3',
'title': 'Viele Baustellen für neuen BND-Chef',
'description': 'md5:1e69a54be3e1255b2b07cdbce5bcd8b4',
- 'thumbnail': 're:^https?:.*\.jpg$',
+ 'thumbnail': r're:^https?:.*\.jpg$',
},
}, {
'url': 'http://www.tagesschau.de/inland/afd-parteitag-135.html',
'ext': 'mp4',
'title': 'Посетителям московского зоопарка показали красную панду',
'description': 'Приехавшую из Дублина Зейну можно увидеть в павильоне "Кошки тропиков"',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
'id': '3453494717001',
'ext': 'mp4',
'title': 'The Gospel by Numbers',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'upload_date': '20140410',
'description': 'Coming soon from T4G 2014!',
'uploader_id': '2034960640001',
'ext': 'mp4',
'title': 'Measures of dispersion from a frequency table',
'description': 'Measures of dispersion from a frequency table',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.teachertube.com/viewVideo.php?video_id=340064',
'ext': 'mp4',
'title': 'How to Make Paper Dolls _ Paper Art Projects',
'description': 'Learn how to make paper dolls in this simple',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://www.teachertube.com/music.php?music_id=8805',
'id': 'tSVI8ta_P4w',
'ext': 'mp4',
'title': 'Vishal Sikka: The beauty and power of algorithms',
- 'thumbnail': 're:^https?://.+\.jpg',
+ 'thumbnail': r're:^https?://.+\.jpg',
'description': 'md5:6261fdfe3e02f4f579cbbfc00aff73f4',
'upload_date': '20140122',
'uploader_id': 'TEDInstitute',
'format_id': '%s-%sk' % (format_id, bitrate),
'tbr': bitrate,
})
- if re.search('\d+k', h264_url):
+ if re.search(r'\d+k', h264_url):
http_url = h264_url
elif format_id == 'rtmp':
streamer = talk_info.get('streamer')
class TeleBruxellesIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt)/?(?P<id>[^/#?]+)'
+ _VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt|emission)/?(?P<id>[^/#?]+)'
_TESTS = [{
- 'url': 'http://www.telebruxelles.be/news/auditions-devant-parlement-francken-galant-tres-attendus/',
- 'md5': '59439e568c9ee42fb77588b2096b214f',
+ 'url': 'http://bx1.be/news/que-risque-lauteur-dune-fausse-alerte-a-la-bombe/',
+ 'md5': 'a2a67a5b1c3e8c9d33109b902f474fd9',
'info_dict': {
- 'id': '11942',
- 'display_id': 'auditions-devant-parlement-francken-galant-tres-attendus',
- 'ext': 'flv',
- 'title': 'Parlement : Francken et Galant répondent aux interpellations de l’opposition',
- 'description': 're:Les auditions des ministres se poursuivent*'
- },
- 'params': {
- 'skip_download': 'requires rtmpdump'
+ 'id': '158856',
+ 'display_id': 'que-risque-lauteur-dune-fausse-alerte-a-la-bombe',
+ 'ext': 'mp4',
+ 'title': 'Que risque l’auteur d’une fausse alerte à la bombe ?',
+ 'description': 'md5:3cf8df235d44ebc5426373050840e466',
},
}, {
- 'url': 'http://www.telebruxelles.be/sport/basket-brussels-bat-mons-80-74/',
- 'md5': '181d3fbdcf20b909309e5aef5c6c6047',
+ 'url': 'http://bx1.be/sport/futsal-schaerbeek-sincline-5-3-a-thulin/',
+ 'md5': 'dfe07ecc9c153ceba8582ac912687675',
'info_dict': {
- 'id': '10091',
- 'display_id': 'basket-brussels-bat-mons-80-74',
- 'ext': 'flv',
- 'title': 'Basket : le Brussels bat Mons 80-74',
- 'description': 're:^Ils l\u2019on fait ! En basket, le B*',
- },
- 'params': {
- 'skip_download': 'requires rtmpdump'
+ 'id': '158433',
+ 'display_id': 'futsal-schaerbeek-sincline-5-3-a-thulin',
+ 'ext': 'mp4',
+ 'title': 'Futsal : Schaerbeek s’incline 5-3 à Thulin',
+ 'description': 'md5:fd013f1488d5e2dceb9cebe39e2d569b',
},
+ }, {
+ 'url': 'http://bx1.be/emission/bxenf1-gastronomie/',
+ 'only_matching': True,
}]
def _real_extract(self, url):
r'file\s*:\s*"(rtmp://[^/]+/vod/mp4:"\s*\+\s*"[^"]+"\s*\+\s*".mp4)"',
webpage, 'RTMP url')
rtmp_url = re.sub(r'"\s*\+\s*"', '', rtmp_url)
+ formats = self._extract_wowza_formats(rtmp_url, article_id or display_id)
+ self._sort_formats(formats)
return {
'id': article_id or display_id,
'display_id': display_id,
'title': title,
'description': description,
- 'url': rtmp_url,
- 'ext': 'flv',
- 'rtmp_live': True # if rtmpdump is not called with "--live" argument, the download is blocked and can be completed
+ 'formats': formats,
}
'ext': 'mp4',
'title': 'Tikibad ontruimd wegens brand',
'description': 'md5:05ca046ff47b931f9b04855015e163a4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 33,
},
'params': {
'ext': 'mp4',
'title': 'Mons - Cook with Danielle : des cours de cuisine en anglais ! - Les reportages',
'description': 'md5:bc5225f47b17c309761c856ad4776265',
- 'thumbnail': 're:^http://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:^http://.*\.(?:jpg|png)$',
}
},
{
'ext': 'mp4',
'title': 'Havré - Incendie mortel - Les reportages',
'description': 'md5:5e54cb449acb029c2b7734e2d946bd4a',
- 'thumbnail': 're:^http://.*\.(?:jpg|png)$',
+ 'thumbnail': r're:^http://.*\.(?:jpg|png)$',
}
},
]
'id': '1263668',
'ext': 'mp4',
'title': 'قرعه\u200cکشی لیگ قهرمانان اروپا',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'view_count': int,
},
'params': {
class ThePlatformBaseIE(OnceIE):
def _extract_theplatform_smil(self, smil_url, video_id, note='Downloading SMIL data'):
- meta = self._download_xml(smil_url, video_id, note=note, query={'format': 'SMIL'})
+ meta = self._download_xml(
+ smil_url, video_id, note=note, query={'format': 'SMIL'},
+ headers=self.geo_verification_headers())
error_element = find_xpath_attr(meta, _x('.//smil:ref'), 'src')
if error_element is not None and error_element.attrib['src'].startswith(
'http://link.theplatform.com/s/errorFiles/Unavailable.'):
'title': 'iPhone Siri’s sassy response to a math question has people talking',
'description': 'md5:a565d1deadd5086f3331d57298ec6333',
'duration': 83.0,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1435752600,
'upload_date': '20150701',
'uploader': 'NBCU-NEWS',
'ext': 'mp4',
'title': 'The Biden factor: will Joe run in 2016?',
'description': 'Could Vice President Joe Biden be preparing a 2016 campaign? Mark Halperin and Sam Stein weigh in.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140208',
'timestamp': 1391824260,
'duration': 467.0,
'ext': 'm4a',
'title': '487: Harper High School, Part One',
'description': 'md5:ee40bdf3fb96174a9027f76dbecea655',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://www.thisamericanlife.org/play_full.php?play=487',
class ThisOldHouseIE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to)/(?P<id>[^/?#]+)'
+ _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode)/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.thisoldhouse.com/how-to/how-to-build-storage-bench',
- 'md5': '568acf9ca25a639f0c4ff905826b662f',
+ 'md5': '946f05bbaa12a33f9ae35580d2dfcfe3',
'info_dict': {
'id': '2REGtUDQ',
'ext': 'mp4',
}, {
'url': 'https://www.thisoldhouse.com/watch/arlington-arts-crafts-arts-and-crafts-class-begins',
'only_matching': True,
+ }, {
+ 'url': 'https://www.thisoldhouse.com/tv-episode/ask-toh-shelf-rough-electric',
+ 'only_matching': True,
}]
def _real_extract(self, url):
webpage = self._download_webpage(url, video_id, 'Downloading page')
mobj = re.search(r'(?m)fo\.addVariable\("file",\s"(?P<fileid>[\da-z]+)"\);\n'
- '\s+fo\.addVariable\("s",\s"(?P<serverid>\d+)"\);', webpage)
+ r'\s+fo\.addVariable\("s",\s"(?P<serverid>\d+)"\);', webpage)
if mobj is None:
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
formats = []
def extract_video_url(vl):
- return re.sub('speed=\d+', 'speed=', unescapeHTML(vl.text))
+ return re.sub(r'speed=\d+', 'speed=', unescapeHTML(vl.text))
video_link = cfg_xml.find('./videoLink')
if video_link is not None:
'display_id': '6538',
'ext': 'mp4',
'title': 'Educational xxx video',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
},
'params': {
'display_id': 'Carmella-Decesare-striptease',
'ext': 'mp4',
'title': 'Carmella Decesare - striptease',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'duration': 91,
'age_limit': 18,
'categories': ['Porn Stars'],
'ext': 'mp4',
'title': 'Educational xxx video',
'description': 'md5:b4fab8f88a8621c8fabd361a173fe5b8',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'duration': 164,
'age_limit': 18,
'uploader': 'bobwhite39',
'ext': 'mp4',
'title': 'Amateur Finger Fuck',
'description': 'Amateur solo finger fucking.',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'duration': 83,
'age_limit': 18,
'uploader': 'cwbike',
'ext': 'mp4',
'title': 'Experienced MILF Amazing Handjob',
'description': 'Experienced MILF giving an Amazing Handjob',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
'uploader': 'darvinfred06',
'view_count': int,
'ext': 'flv',
'title': 'Jeune Couple Russe',
'description': 'Amateur',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'age_limit': 18,
'uploader': 'whiskeyjar',
'view_count': int,
'id': '159448201',
'ext': 'f4v',
'title': '卡马乔国足开大脚长传冲吊集锦',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1372113489000,
'description': '卡马乔卡家军,开大脚先进战术不完全集锦!',
'duration': 289.04,
'id': '117049447',
'ext': 'f4v',
'title': 'La Sylphide-Bolshoi-Ekaterina Krysanova & Vyacheslav Lopatin 2012',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1349207518000,
'description': 'md5:294612423894260f2dcd5c6c04fe248b',
'duration': 5478.33,
'ext': 'mp4',
'title': 'tatiana maslany news, Orphan Black || DVD extra - behind the scenes ↳...',
'description': 'md5:37db8211e40b50c7c44e95da14f630b7',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
}
}, {
'url': 'http://5sostrum.tumblr.com/post/90208453769/yall-forgetting-the-greatest-keek-of-them-all',
'ext': 'mp4',
'title': '5SOS STRUM ;]',
'description': 'md5:dba62ac8639482759c8eb10ce474586a',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
}
}, {
'url': 'http://hdvideotest.tumblr.com/post/130323439814/test-description-for-my-hd-video',
'ext': 'mp4',
'title': 'HD Video Testing \u2014 Test description for my HD video',
'description': 'md5:97cc3ab5fcd27ee4af6356701541319c',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
'params': {
'format': 'hd',
'title': 'Video by victoriassecret',
'description': 'Invisibility or flight…which superpower would YOU choose? #VSFashionShow #ThisOrThat',
'uploader_id': 'victoriassecret',
- 'thumbnail': 're:^https?://.*\.jpg'
+ 'thumbnail': r're:^https?://.*\.jpg'
},
'add_ie': ['Instagram'],
}]
class TuneInBaseIE(InfoExtractor):
_API_BASE_URL = 'http://tunein.com/tuner/tune/'
+ @staticmethod
+ def _extract_urls(webpage):
+ return re.findall(
+ r'<iframe[^>]+src=["\'](?P<url>(?:https?://)?tunein\.com/embed/player/[pst]\d+)',
+ webpage)
+
def _real_extract(self, url):
content_id = self._match_id(url)
    _VALID_URL = r'https?://(?:www\.)?tunein\.com/station/.*?audioClipId=(?P<id>\d+)'
_API_URL_QUERY = '?tuneType=AudioClip&audioclipId=%s'
- _TESTS = [
- {
- 'url': 'http://tunein.com/station/?stationId=246119&audioClipId=816',
- 'md5': '99f00d772db70efc804385c6b47f4e77',
- 'info_dict': {
- 'id': '816',
- 'title': '32m',
- 'ext': 'mp3',
- },
+ _TESTS = [{
+ 'url': 'http://tunein.com/station/?stationId=246119&audioClipId=816',
+ 'md5': '99f00d772db70efc804385c6b47f4e77',
+ 'info_dict': {
+ 'id': '816',
+ 'title': '32m',
+ 'ext': 'mp3',
},
- ]
+ }]
class TuneInStationIE(TuneInBaseIE):
IE_NAME = 'tunein:station'
- _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-s|station/.*?StationId\=)(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-s|station/.*?StationId=|embed/player/s)(?P<id>\d+)'
_API_URL_QUERY = '?tuneType=Station&stationId=%s'
@classmethod
def suitable(cls, url):
return False if TuneInClipIE.suitable(url) else super(TuneInStationIE, cls).suitable(url)
- _TESTS = [
- {
- 'url': 'http://tunein.com/radio/Jazz24-885-s34682/',
- 'info_dict': {
- 'id': '34682',
- 'title': 'Jazz 24 on 88.5 Jazz24 - KPLU-HD2',
- 'ext': 'mp3',
- 'location': 'Tacoma, WA',
- },
- 'params': {
- 'skip_download': True, # live stream
- },
+ _TESTS = [{
+ 'url': 'http://tunein.com/radio/Jazz24-885-s34682/',
+ 'info_dict': {
+ 'id': '34682',
+ 'title': 'Jazz 24 on 88.5 Jazz24 - KPLU-HD2',
+ 'ext': 'mp3',
+ 'location': 'Tacoma, WA',
+ },
+ 'params': {
+ 'skip_download': True, # live stream
},
- ]
+ }, {
+ 'url': 'http://tunein.com/embed/player/s6404/',
+ 'only_matching': True,
+ }]
class TuneInProgramIE(TuneInBaseIE):
IE_NAME = 'tunein:program'
- _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-p|program/.*?ProgramId\=)(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:radio/.*?-p|program/.*?ProgramId=|embed/player/p)(?P<id>\d+)'
_API_URL_QUERY = '?tuneType=Program&programId=%s'
- _TESTS = [
- {
- 'url': 'http://tunein.com/radio/Jazz-24-p2506/',
- 'info_dict': {
- 'id': '2506',
- 'title': 'Jazz 24 on 91.3 WUKY-HD3',
- 'ext': 'mp3',
- 'location': 'Lexington, KY',
- },
- 'params': {
- 'skip_download': True, # live stream
- },
+ _TESTS = [{
+ 'url': 'http://tunein.com/radio/Jazz-24-p2506/',
+ 'info_dict': {
+ 'id': '2506',
+ 'title': 'Jazz 24 on 91.3 WUKY-HD3',
+ 'ext': 'mp3',
+ 'location': 'Lexington, KY',
},
- ]
+ 'params': {
+ 'skip_download': True, # live stream
+ },
+ }, {
+ 'url': 'http://tunein.com/embed/player/p191660/',
+ 'only_matching': True,
+ }]
class TuneInTopicIE(TuneInBaseIE):
IE_NAME = 'tunein:topic'
- _VALID_URL = r'https?://(?:www\.)?tunein\.com/topic/.*?TopicId\=(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.)?tunein\.com/(?:topic/.*?TopicId=|embed/player/t)(?P<id>\d+)'
_API_URL_QUERY = '?tuneType=Topic&topicId=%s'
- _TESTS = [
- {
- 'url': 'http://tunein.com/topic/?TopicId=101830576',
- 'md5': 'c31a39e6f988d188252eae7af0ef09c9',
- 'info_dict': {
- 'id': '101830576',
- 'title': 'Votez pour moi du 29 octobre 2015 (29/10/15)',
- 'ext': 'mp3',
- 'location': 'Belgium',
- },
+ _TESTS = [{
+ 'url': 'http://tunein.com/topic/?TopicId=101830576',
+ 'md5': 'c31a39e6f988d188252eae7af0ef09c9',
+ 'info_dict': {
+ 'id': '101830576',
+ 'title': 'Votez pour moi du 29 octobre 2015 (29/10/15)',
+ 'ext': 'mp3',
+ 'location': 'Belgium',
},
- ]
+ }, {
+ 'url': 'http://tunein.com/embed/player/t101830576/',
+ 'only_matching': True,
+ }]
class TuneInShortenerIE(InfoExtractor):
'duration': 3715,
'title': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia... ',
'description': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
formats.extend(self._extract_smil_formats(
video_url, video_id, fatal=False))
elif ext == 'm3u8':
- formats.extend(self._extract_m3u8_formats(
+ m3u8_formats = self._extract_m3u8_formats(
video_url, video_id, 'mp4',
- m3u8_id=format_id or 'hls', fatal=False))
+ m3u8_id=format_id or 'hls', fatal=False)
+ if '/secure/' in video_url and '?hdnea=' in video_url:
+ for f in m3u8_formats:
+ f['_seekable'] = False
+ formats.extend(m3u8_formats)
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
update_url_query(video_url, {'hdcore': '3.7.0'}),
if not assets:
# New embed pattern
- for v in re.findall('TV2ContentboxVideo\(({.+?})\)', webpage):
+ for v in re.findall(r'TV2ContentboxVideo\(({.+?})\)', webpage):
video = self._parse_json(
v, playlist_id, transform_source=js_to_json, fatal=False)
if not video:
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
- ExtractorError,
int_or_none,
parse_iso8601,
try_get,
- update_url_query,
+ determine_ext,
)
_TESTS = [
{
'url': 'http://www.tv4.se/kalla-fakta/klipp/kalla-fakta-5-english-subtitles-2491650',
- 'md5': '909d6454b87b10a25aa04c4bdd416a9b',
+ 'md5': 'cb837212f342d77cec06e6dad190e96d',
'info_dict': {
'id': '2491650',
'ext': 'mp4',
'title': 'Kalla Fakta 5 (english subtitles)',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': int,
'upload_date': '20131125',
},
},
{
'url': 'http://www.tv4play.se/iframe/video/3054113',
- 'md5': '77f851c55139ffe0ebd41b6a5552489b',
+ 'md5': 'cb837212f342d77cec06e6dad190e96d',
'info_dict': {
'id': '3054113',
'ext': 'mp4',
'title': 'Så här jobbar ficktjuvarna - se avslöjande bilder',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Unika bilder avslöjar hur turisternas fickor vittjas mitt på Stockholms central. Två experter på ficktjuvarna avslöjar knepen du ska se upp för.',
'timestamp': int,
'upload_date': '20150130',
# If is_geo_restricted is true, it doesn't necessarily mean we can't download it
if info.get('is_geo_restricted'):
self.report_warning('This content might not be available in your country due to licensing restrictions.')
- if info.get('requires_subscription'):
- raise ExtractorError('This content requires subscription.', expected=True)
title = info['title']
+ subtitles = {}
formats = []
# http formats are linked with unresolvable host
for kind in ('hls', ''):
'https://prima.tv4play.se/api/web/asset/%s/play.json' % video_id,
video_id, 'Downloading sources JSON', query={
'protocol': kind,
- 'videoFormat': 'MP4+WEBVTTS+WEBVTT',
+ 'videoFormat': 'MP4+WEBVTT',
})
- item = try_get(data, lambda x: x['playback']['items']['item'], dict)
- manifest_url = item.get('url')
- if not isinstance(manifest_url, compat_str):
+ items = try_get(data, lambda x: x['playback']['items']['item'])
+ if not items:
continue
- if kind == 'hls':
- formats.extend(self._extract_m3u8_formats(
- manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
- m3u8_id=kind, fatal=False))
- else:
- formats.extend(self._extract_f4m_formats(
- update_url_query(manifest_url, {'hdcore': '3.8.0'}),
- video_id, f4m_id='hds', fatal=False))
+ if isinstance(items, dict):
+ items = [items]
+ for item in items:
+ manifest_url = item.get('url')
+ if not isinstance(manifest_url, compat_str):
+ continue
+ ext = determine_ext(manifest_url)
+ if ext == 'm3u8':
+ formats.extend(self._extract_m3u8_formats(
+ manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
+ m3u8_id=kind, fatal=False))
+ elif ext == 'f4m':
+ formats.extend(self._extract_akamai_formats(
+ manifest_url, video_id, {
+ 'hls': 'tv4play-i.akamaihd.net',
+ }))
+ elif ext == 'webvtt':
+ subtitles = self._merge_subtitles(
+ subtitles, {
+ 'sv': [{
+ 'url': manifest_url,
+ 'ext': 'vtt',
+ }]})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'formats': formats,
+ 'subtitles': subtitles,
'description': info.get('description'),
'timestamp': parse_iso8601(info.get('broadcast_date_time')),
'duration': int_or_none(info.get('duration')),
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ parse_iso8601,
+ smuggle_url,
+)
+
+
+class TVAIE(InfoExtractor):
+ _VALID_URL = r'https?://videos\.tva\.ca/episode/(?P<id>\d+)'
+ _TEST = {
+ 'url': 'http://videos.tva.ca/episode/85538',
+ 'info_dict': {
+ 'id': '85538',
+ 'ext': 'mp4',
+ 'title': 'Épisode du 25 janvier 2017',
+ 'description': 'md5:e9e7fb5532ab37984d2dc87229cadf98',
+ 'upload_date': '20170126',
+ 'timestamp': 1485442329,
+ },
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ }
+ }
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ video_data = self._download_json(
+ "https://d18jmrhziuoi7p.cloudfront.net/isl/api/v1/dataservice/Items('%s')" % video_id,
+ video_id, query={
+ '$expand': 'Metadata,CustomId',
+ '$select': 'Metadata,Id,Title,ShortDescription,LongDescription,CreatedDate,CustomId,AverageUserRating,Categories,ShowName',
+ '$format': 'json',
+ })
+ metadata = video_data.get('Metadata', {})
+
+ return {
+ '_type': 'url_transparent',
+ 'id': video_id,
+ 'title': video_data['Title'],
+ 'url': smuggle_url('ooyala:' + video_data['CustomId'], {'supportedformats': 'm3u8,hds'}),
+ 'description': video_data.get('LongDescription') or video_data.get('ShortDescription'),
+ 'series': video_data.get('ShowName'),
+ 'episode': metadata.get('EpisodeTitle'),
+ 'episode_number': int_or_none(metadata.get('EpisodeNumber')),
+ 'categories': video_data.get('Categories'),
+ 'average_rating': video_data.get('AverageUserRating'),
+ 'timestamp': parse_iso8601(video_data.get('CreatedDate')),
+ 'ie_key': 'Ooyala',
+ }
'id': '74622',
'ext': 'mp4',
'title': 'События. "События". Эфир от 22.05.2015 14:30',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1122,
},
}
'ext': 'mp4',
'title': 'События. "События". Эфир от 22.05.2015 14:30',
'description': 'md5:ad7aa7db22903f983e687b8a3e98c6dd',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1122,
},
}, {
'ext': 'mp4',
'title': 'Эксперты: в столице встал вопрос о максимально безопасных остановках',
'description': 'md5:f2098f71e21f309e89f69b525fd9846e',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 278,
},
}, {
'ext': 'mp4',
'title': 'Ещё не поздно. Эфир от 03.08.2013',
'description': 'md5:51fae9f3f8cfe67abce014e428e5b027',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 3316,
},
}]
'ext': 'mp4',
'title': 'New Nintendo 3DS XL - Op alle fronten beter',
'description': 'md5:3789b21fed9c0219e9bcaacd43fab280',
- 'thumbnail': 're:^https?://.*\.jpe?g$',
+ 'thumbnail': r're:^https?://.*\.jpe?g$',
'duration': 386,
'uploader_id': 's7JeEm',
}
class TwentyFourVideoIE(InfoExtractor):
IE_NAME = '24video'
- _VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
+ _VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx|sex)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.24video.net/video/view/1044982',
'ext': 'mp4',
'title': 'Эротика каменного века',
'description': 'Как смотрели порно в каменном веке.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'SUPERTELO',
'duration': 31,
'timestamp': 1275937857,
video_id = self._match_id(url)
webpage = self._download_webpage(
- 'http://www.24video.net/video/view/%s' % video_id, video_id)
+ 'http://www.24video.sex/video/view/%s' % video_id, video_id)
title = self._og_search_title(webpage)
description = self._html_search_regex(
# Sets some cookies
self._download_xml(
- r'http://www.24video.net/video/xml/%s?mode=init' % video_id,
+            'http://www.24video.sex/video/xml/%s?mode=init' % video_id,
video_id, 'Downloading init XML')
video_xml = self._download_xml(
- 'http://www.24video.net/video/xml/%s?mode=play' % video_id,
+ 'http://www.24video.sex/video/xml/%s?mode=play' % video_id,
video_id, 'Downloading video XML')
video = xpath_element(video_xml, './/video', 'video', fatal=True)
import re
from .common import InfoExtractor
-from ..utils import remove_end
+from ..utils import (
+ int_or_none,
+ try_get,
+)
class TwentyMinutenIE(InfoExtractor):
IE_NAME = '20min'
- _VALID_URL = r'https?://(?:www\.)?20min\.ch/(?:videotv/*\?.*\bvid=(?P<id>\d+)|(?:[^/]+/)*(?P<display_id>[^/#?]+))'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:www\.)?20min\.ch/
+ (?:
+ videotv/*\?.*?\bvid=|
+ videoplayer/videoplayer\.html\?.*?\bvideoId@
+ )
+ (?P<id>\d+)
+ '''
_TESTS = [{
- # regular video
'url': 'http://www.20min.ch/videotv/?vid=469148&cid=2',
- 'md5': 'b52d6bc6ea6398e6a38f12cfd418149c',
+ 'md5': 'e7264320db31eed8c38364150c12496e',
'info_dict': {
'id': '469148',
- 'ext': 'flv',
+ 'ext': 'mp4',
'title': '85 000 Franken für 15 perfekte Minuten',
- 'description': 'Was die Besucher vom Silvesterzauber erwarten können. (Video: Alice Grosjean/Murat Temel)',
- 'thumbnail': 'http://thumbnails.20min-tv.ch/server063/469148/frame-72-469148.jpg'
- }
- }, {
- # news article with video
- 'url': 'http://www.20min.ch/schweiz/news/story/-Wir-muessen-mutig-nach-vorne-schauen--22050469',
- 'md5': 'cd4cbb99b94130cff423e967cd275e5e',
- 'info_dict': {
- 'id': '469408',
- 'display_id': '-Wir-muessen-mutig-nach-vorne-schauen--22050469',
- 'ext': 'flv',
- 'title': '«Wir müssen mutig nach vorne schauen»',
- 'description': 'Kein Land sei innovativer als die Schweiz, sagte Johann Schneider-Ammann in seiner Neujahrsansprache. Das Land müsse aber seine Hausaufgaben machen.',
- 'thumbnail': 'http://www.20min.ch/images/content/2/2/0/22050469/10/teaserbreit.jpg'
+ 'thumbnail': r're:https?://.*\.jpg$',
},
- 'skip': '"This video is no longer available" is shown both on the web page and in the downloaded file.',
}, {
- # YouTube embed
- 'url': 'http://www.20min.ch/ro/sports/football/story/Il-marque-une-bicyclette-de-plus-de-30-metres--21115184',
- 'md5': 'cec64d59aa01c0ed9dbba9cf639dd82f',
+ 'url': 'http://www.20min.ch/videoplayer/videoplayer.html?params=client@twentyDE|videoId@523629',
'info_dict': {
- 'id': 'ivM7A7SpDOs',
+ 'id': '523629',
'ext': 'mp4',
- 'title': 'GOLAZO DE CHILENA DE JAVI GÓMEZ, FINALISTA AL BALÓN DE CLM 2016',
- 'description': 'md5:903c92fbf2b2f66c09de514bc25e9f5a',
- 'upload_date': '20160424',
- 'uploader': 'RTVCM Castilla-La Mancha',
- 'uploader_id': 'RTVCM',
+ 'title': 'So kommen Sie bei Eis und Schnee sicher an',
+ 'description': 'md5:117c212f64b25e3d95747e5276863f7d',
+ 'thumbnail': r're:https?://.*\.jpg$',
+ },
+ 'params': {
+ 'skip_download': True,
},
- 'add_ie': ['Youtube'],
}, {
'url': 'http://www.20min.ch/videotv/?cid=44&vid=468738',
'only_matching': True,
- }, {
- 'url': 'http://www.20min.ch/ro/sortir/cinema/story/Grandir-au-bahut--c-est-dur-18927411',
- 'only_matching': True,
}]
+ @staticmethod
+ def _extract_urls(webpage):
+ return [m.group('url') for m in re.finditer(
+            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:www\.)?20min\.ch/videoplayer/videoplayer\.html\?.*?\bvideoId@\d+.*?)\1',
+ webpage)]
+
def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
- display_id = mobj.group('display_id') or video_id
+ video_id = self._match_id(url)
+
+ video = self._download_json(
+ 'http://api.20min.ch/video/%s/show' % video_id,
+ video_id)['content']
- webpage = self._download_webpage(url, display_id)
+ title = video['title']
- youtube_url = self._html_search_regex(
- r'<iframe[^>]+src="((?:https?:)?//www\.youtube\.com/embed/[^"]+)"',
- webpage, 'YouTube embed URL', default=None)
- if youtube_url is not None:
- return self.url_result(youtube_url, 'Youtube')
+ formats = [{
+ 'format_id': format_id,
+ 'url': 'http://podcast.20min-tv.ch/podcast/20min/%s%s.mp4' % (video_id, p),
+ 'quality': quality,
+ } for quality, (format_id, p) in enumerate([('sd', ''), ('hd', 'h')])]
+ self._sort_formats(formats)
- title = self._html_search_regex(
- r'<h1>.*?<span>(.+?)</span></h1>',
- webpage, 'title', default=None)
- if not title:
- title = remove_end(re.sub(
- r'^20 [Mm]inuten.*? -', '', self._og_search_title(webpage)), ' - News')
+ description = video.get('lead')
+ thumbnail = video.get('thumbnail')
- if not video_id:
- video_id = self._search_regex(
- r'"file\d?"\s*,\s*\"(\d+)', webpage, 'video id')
+ def extract_count(kind):
+ return try_get(
+ video,
+ lambda x: int_or_none(x['communityobject']['thumbs_%s' % kind]))
- description = self._html_search_meta(
- 'description', webpage, 'description')
- thumbnail = self._og_search_thumbnail(webpage)
+ like_count = extract_count('up')
+ dislike_count = extract_count('down')
return {
'id': video_id,
- 'display_id': display_id,
- 'url': 'http://speed.20min-tv.ch/%sm.flv' % video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
+ 'like_count': like_count,
+ 'dislike_count': dislike_count,
+ 'formats': formats,
}
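The two-entry format list in the hunk above is built by enumerating `(format_id, path_suffix)` pairs, so the `hd` entry gets the higher `quality` rank for `_sort_formats`. A minimal standalone sketch of that pattern (the podcast URL template is taken from the diff; `'469148'` is just the sample id from the test):

```python
# Build youtube-dl style format dicts for the two known 20min.ch renditions.
# Higher enumerate() index means higher preference once sorted by 'quality'.
video_id = '469148'
formats = [{
    'format_id': format_id,
    'url': 'http://podcast.20min-tv.ch/podcast/20min/%s%s.mp4' % (video_id, p),
    'quality': quality,
} for quality, (format_id, p) in enumerate([('sd', ''), ('hd', 'h')])]

print(formats[1]['url'])  # → http://podcast.20min-tv.ch/podcast/20min/469148h.mp4
```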
orderedSet,
parse_duration,
parse_iso8601,
+ update_url_query,
urlencode_postdata,
)
class TwitchVodIE(TwitchItemBaseIE):
IE_NAME = 'twitch:vod'
- _VALID_URL = r'%s/[^/]+/v/(?P<id>\d+)' % TwitchBaseIE._VALID_URL_BASE
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:
+ (?:www\.)?twitch\.tv/(?:[^/]+/v|videos)/|
+ player\.twitch\.tv/\?.*?\bvideo=v
+ )
+ (?P<id>\d+)
+ '''
_ITEM_TYPE = 'vod'
_ITEM_SHORTCUT = 'v'
'id': 'v6528877',
'ext': 'mp4',
'title': 'LCK Summer Split - Week 6 Day 1',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 17208,
'timestamp': 1435131709,
'upload_date': '20150624',
'id': 'v11230755',
'ext': 'mp4',
'title': 'Untitled Broadcast',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1638,
'timestamp': 1439746708,
'upload_date': '20150816',
'skip_download': True,
},
'skip': 'HTTP Error 404: Not Found',
+ }, {
+ 'url': 'http://player.twitch.tv/?t=5m10s&video=v6528877',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://www.twitch.tv/videos/6528877',
+ 'only_matching': True,
}]
def _real_extract(self, url):
if 't' in query:
info['start_time'] = parse_duration(query['t'][0])
+ if info.get('timestamp') is not None:
+ info['subtitles'] = {
+ 'rechat': [{
+ 'url': update_url_query(
+ 'https://rechat.twitch.tv/rechat-messages', {
+ 'video_id': 'v%s' % item_id,
+ 'start': info['timestamp'],
+ }),
+ 'ext': 'json',
+ }],
+ }
+
return info
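The rechat hunk above attaches Twitch's chat-replay JSON endpoint as a pseudo-subtitle track, using `update_url_query` from youtube-dl's utils to append the query parameters. A simplified stdlib-only stand-in for that helper (assuming Python 3; this is not the library code, just the idea):

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def update_url_query(url, query):
    # Merge `query` into the URL's existing query string, like the
    # youtube-dl utils helper used in the hunk above (simplified stand-in).
    parsed = urlparse(url)
    qs = dict(parse_qsl(parsed.query))
    qs.update(query)
    return urlunparse(parsed._replace(query=urlencode(qs)))


res = update_url_query('https://rechat.twitch.tv/rechat-messages',
                       {'video_id': 'v6528877', 'start': 1435131709})
print(res)
```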
response = self._call_api(
self._PLAYLIST_PATH % (channel_id, offset, limit),
channel_id,
- 'Downloading %s videos JSON page %s'
+ 'Downloading %s JSON page %s'
% (self._PLAYLIST_TYPE, counter_override or counter))
page_entries = self._extract_playlist_page(response)
if not page_entries:
}
-class TwitchPastBroadcastsIE(TwitchPlaylistBaseIE):
- IE_NAME = 'twitch:past_broadcasts'
- _VALID_URL = r'%s/(?P<id>[^/]+)/profile/past_broadcasts/?(?:\#.*)?$' % TwitchBaseIE._VALID_URL_BASE
- _PLAYLIST_PATH = TwitchPlaylistBaseIE._PLAYLIST_PATH + '&broadcasts=true'
+class TwitchVideosBaseIE(TwitchPlaylistBaseIE):
+ _VALID_URL_VIDEOS_BASE = r'%s/(?P<id>[^/]+)/videos' % TwitchBaseIE._VALID_URL_BASE
+ _PLAYLIST_PATH = TwitchPlaylistBaseIE._PLAYLIST_PATH + '&broadcast_type='
+
+
+class TwitchAllVideosIE(TwitchVideosBaseIE):
+ IE_NAME = 'twitch:videos:all'
+ _VALID_URL = r'%s/all' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
+ _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'archive,upload,highlight'
+ _PLAYLIST_TYPE = 'all videos'
+
+ _TEST = {
+ 'url': 'https://www.twitch.tv/spamfish/videos/all',
+ 'info_dict': {
+ 'id': 'spamfish',
+ 'title': 'Spamfish',
+ },
+ 'playlist_mincount': 869,
+ }
+
+
+class TwitchUploadsIE(TwitchVideosBaseIE):
+ IE_NAME = 'twitch:videos:uploads'
+ _VALID_URL = r'%s/uploads' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
+ _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'upload'
+ _PLAYLIST_TYPE = 'uploads'
+
+ _TEST = {
+ 'url': 'https://www.twitch.tv/spamfish/videos/uploads',
+ 'info_dict': {
+ 'id': 'spamfish',
+ 'title': 'Spamfish',
+ },
+ 'playlist_mincount': 0,
+ }
+
+
+class TwitchPastBroadcastsIE(TwitchVideosBaseIE):
+ IE_NAME = 'twitch:videos:past-broadcasts'
+ _VALID_URL = r'%s/past-broadcasts' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
+ _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'archive'
_PLAYLIST_TYPE = 'past broadcasts'
_TEST = {
- 'url': 'http://www.twitch.tv/spamfish/profile/past_broadcasts',
+ 'url': 'https://www.twitch.tv/spamfish/videos/past-broadcasts',
+ 'info_dict': {
+ 'id': 'spamfish',
+ 'title': 'Spamfish',
+ },
+ 'playlist_mincount': 0,
+ }
+
+
+class TwitchHighlightsIE(TwitchVideosBaseIE):
+ IE_NAME = 'twitch:videos:highlights'
+ _VALID_URL = r'%s/highlights' % TwitchVideosBaseIE._VALID_URL_VIDEOS_BASE
+ _PLAYLIST_PATH = TwitchVideosBaseIE._PLAYLIST_PATH + 'highlight'
+ _PLAYLIST_TYPE = 'highlights'
+
+ _TEST = {
+ 'url': 'https://www.twitch.tv/spamfish/videos/highlights',
'info_dict': {
'id': 'spamfish',
'title': 'Spamfish',
},
- 'playlist_mincount': 54,
+ 'playlist_mincount': 805,
}
class TwitchStreamIE(TwitchBaseIE):
IE_NAME = 'twitch:stream'
- _VALID_URL = r'%s/(?P<id>[^/#?]+)/?(?:\#.*)?$' % TwitchBaseIE._VALID_URL_BASE
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:
+ (?:www\.)?twitch\.tv/|
+ player\.twitch\.tv/\?.*?\bchannel=
+ )
+ (?P<id>[^/#?]+)
+ '''
_TESTS = [{
'url': 'http://www.twitch.tv/shroomztv',
}, {
'url': 'http://www.twitch.tv/miracle_doto#profile-0',
'only_matching': True,
+ }, {
+ 'url': 'https://player.twitch.tv/?channel=lotsofs',
+ 'only_matching': True,
}]
+ @classmethod
+ def suitable(cls, url):
+ return (False
+ if any(ie.suitable(url) for ie in (
+ TwitchVideoIE,
+ TwitchChapterIE,
+ TwitchVodIE,
+ TwitchProfileIE,
+ TwitchAllVideosIE,
+ TwitchUploadsIE,
+ TwitchPastBroadcastsIE,
+ TwitchHighlightsIE))
+ else super(TwitchStreamIE, cls).suitable(url))
+
def _real_extract(self, url):
channel_id = self._match_id(url)
'id': 'AggressiveCobraPoooound',
'ext': 'mp4',
'title': 'EA Play 2016 Live from the Novo Theatre',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'creator': 'EA',
'uploader': 'stereotype_',
'uploader_id': 'stereotype_',
'id': '560070183650213889',
'ext': 'mp4',
'title': 'Twitter Card',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 30.033,
}
},
'id': '623160978427936768',
'ext': 'mp4',
'title': 'Twitter Card',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 80.155,
},
},
'id': '705235433198714880',
'ext': 'mp4',
'title': 'Twitter web player',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'https://twitter.com/i/videos/752274308186120192',
'id': '643211948184596480',
'ext': 'mp4',
'title': 'FREE THE NIPPLE - FTN supporters on Hollywood Blvd today!',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'description': 'FREE THE NIPPLE on Twitter: "FTN supporters on Hollywood Blvd today! http://t.co/c7jHH749xJ"',
'uploader': 'FREE THE NIPPLE',
'uploader_id': 'freethenipple',
'ext': 'mp4',
'title': 'Gifs - tu vai cai tu vai cai tu nao eh capaz disso tu vai cai',
'description': 'Gifs on Twitter: "tu vai cai tu vai cai tu nao eh capaz disso tu vai cai https://t.co/tM46VHFlO5"',
- 'thumbnail': 're:^https?://.*\.png',
+ 'thumbnail': r're:^https?://.*\.png',
'uploader': 'Gifs',
'uploader_id': 'giphz',
},
'ext': 'mp4',
'title': 'JG - BEAT PROD: @suhmeduh #Damndaniel',
'description': 'JG on Twitter: "BEAT PROD: @suhmeduh https://t.co/HBrQ4AfpvZ #Damndaniel https://t.co/byBooq2ejZ"',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'uploader': 'JG',
'uploader_id': 'jaydingeer',
},
'id': '300040',
'ext': 'mp4',
'title': '生物老師男變女 全校挺"做自己"',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
# m3u8 download
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+
+class UKTVPlayIE(InfoExtractor):
+ _VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/.+?\?.*?\bvideo=(?P<id>\d+)'
+ _TEST = {
+ 'url': 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001',
+ 'info_dict': {
+ 'id': '2117008346001',
+ 'ext': 'mp4',
+ 'title': 'Pincers',
+ 'description': 'Pincers',
+ 'uploader_id': '1242911124001',
+ 'upload_date': '20130124',
+ 'timestamp': 1359049267,
+ },
+ 'params': {
+ # m3u8 download
+ 'skip_download': True,
+ },
+ 'expected_warnings': ['Failed to download MPD manifest']
+ }
+ BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911124001/H1xnMOqP_default/index.html?videoId=%s'
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+ return self.url_result(
+ self.BRIGHTCOVE_URL_TEMPLATE % video_id,
+ 'BrightcoveNew', video_id)
def _real_extract(self, url):
video_id = self._match_id(url)
- if not video_id.isdigit():
- embed_page = self._download_webpage('https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id, video_id)
- video_id = self._search_regex(r'mediaId=(\d+)', embed_page, 'media id')
+ media_id = None
+
+ if video_id.isdigit():
+ media_id = video_id
+
+ if not media_id:
+ embed_page = self._download_webpage(
+ 'https://jsuol.com.br/c/tv/uol/embed/?params=[embed,%s]' % video_id,
+ video_id, 'Downloading embed page', fatal=False)
+ if embed_page:
+ media_id = self._search_regex(
+ (r'uol\.com\.br/(\d+)', r'mediaId=(\d+)'),
+ embed_page, 'media id', default=None)
+
+ if not media_id:
+ webpage = self._download_webpage(url, video_id)
+ media_id = self._search_regex(r'mediaId=(\d+)', webpage, 'media id')
+
video_data = self._download_json(
- 'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % video_id,
- video_id)['item']
+ 'http://mais.uol.com.br/apiuol/v3/player/getMedia/%s.json' % media_id,
+ media_id)['item']
title = video_data['title']
query = {
tags.append(tag_description)
return {
- 'id': video_id,
+ 'id': media_id,
'title': title,
'description': clean_html(video_data.get('desMedia')),
'thumbnail': video_data.get('thumbnail'),
def _extract_uplynk_info(self, uplynk_content_url):
path, external_id, video_id, session_id = re.match(UplynkIE._VALID_URL, uplynk_content_url).groups()
display_id = video_id or external_id
- formats = self._extract_m3u8_formats('http://content.uplynk.com/%s.m3u8' % path, display_id, 'mp4')
+ formats = self._extract_m3u8_formats(
+ 'http://content.uplynk.com/%s.m3u8' % path,
+ display_id, 'mp4', 'm3u8_native')
if session_id:
for f in formats:
f['extra_param_to_segment_url'] = 'pbs=' + session_id
'id': '33124-24',
'ext': 'mp3',
'title': 'The Bomb',
- 'thumbnail': 're:^https?://.+\.jpg',
+ 'thumbnail': r're:^https?://.+\.jpg',
'uploader': 'Gerilja',
'uploader_id': 'Gerilja',
'upload_date': '20100323',
},
}]
+ @staticmethod
+ def _extract_url(webpage):
+ mobj = re.search(
+ r'<iframe[^>]+?src=(["\'])(?P<url>http://www\.ustream\.tv/embed/.+?)\1', webpage)
+ if mobj is not None:
+ return mobj.group('url')
+
def _get_stream_info(self, url, video_id, app_id_ver, extra_note=None):
def num_to_hex(n):
return hex(n)[2:]
'ext': 'mp4',
'title': 'San Francisco: Golden Gate Bridge',
'description': 'md5:23925500697f2c6d4830e387ba51a9be',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20111107',
'uploader': 'Tony Farley',
}
'ext': 'mp4',
'title': '۵ واکنش برتر دروازهبانان؛هفته ۲۶ بوندسلیگا',
'description': 'فصل ۲۰۱۵-۲۰۱۴',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
'skip': 'HTTP 404 Error',
}, {
webpage, display_id, default=None)
if video_id is None:
video_id = self._search_regex(
- 'var\s+VideoId\s*=\s*(\d+);', webpage, 'video id',
+ r'var\s+VideoId\s*=\s*(\d+);', webpage, 'video id',
default=display_id)
return {
import re
from .common import InfoExtractor
-from ..utils import urlencode_postdata
+from ..utils import ExtractorError
class Vbox7IE(InfoExtractor):
- _VALID_URL = r'https?://(?:www\.)?vbox7\.com/(?:play:|emb/external\.php\?.*?\bvid=)(?P<id>[\da-fA-F]+)'
+ _VALID_URL = r'''(?x)
+ https?://
+ (?:[^/]+\.)?vbox7\.com/
+ (?:
+ play:|
+ (?:
+ emb/external\.php|
+ player/ext\.swf
+ )\?.*?\bvid=
+ )
+ (?P<id>[\da-fA-F]+)
+ '''
_TESTS = [{
'url': 'http://vbox7.com/play:0946fff23c',
'md5': 'a60f9ab3a3a2f013ef9a967d5f7be5bf',
'id': '0946fff23c',
'ext': 'mp4',
'title': 'Борисов: Притеснен съм за бъдещето на България',
+ 'description': 'По думите му е опасно страната ни да бъде обявена за "сигурна"',
+ 'thumbnail': r're:^https?://.*\.jpg$',
+ 'timestamp': 1470982814,
+ 'upload_date': '20160812',
+ 'uploader': 'zdraveibulgaria',
},
}, {
'url': 'http://vbox7.com/play:249bb972c2',
}, {
'url': 'http://vbox7.com/emb/external.php?vid=a240d20f9c&autoplay=1',
'only_matching': True,
+ }, {
+ 'url': 'http://i49.vbox7.com/player/ext.swf?vid=0946fff23c&autoplay=1',
+ 'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
mobj = re.search(
- '<iframe[^>]+src=(?P<q>["\'])(?P<url>(?:https?:)?//vbox7\.com/emb/external\.php.+?)(?P=q)',
+ r'<iframe[^>]+src=(?P<q>["\'])(?P<url>(?:https?:)?//vbox7\.com/emb/external\.php.+?)(?P=q)',
webpage)
if mobj:
return mobj.group('url')
def _real_extract(self, url):
video_id = self._match_id(url)
- webpage = self._download_webpage(
- 'http://vbox7.com/play:%s' % video_id, video_id)
-
- title = self._html_search_regex(
- r'<title>(.+?)</title>', webpage, 'title').split('/')[0].strip()
+ response = self._download_json(
+ 'https://www.vbox7.com/ajax/video/nextvideo.php?vid=%s' % video_id,
+ video_id)
- video_url = self._search_regex(
- r'src\s*:\s*(["\'])(?P<url>.+?.mp4.*?)\1',
- webpage, 'video url', default=None, group='url')
+ if 'error' in response:
+ raise ExtractorError(
+ '%s said: %s' % (self.IE_NAME, response['error']), expected=True)
- thumbnail_url = self._og_search_thumbnail(webpage)
+ video = response['options']
- if not video_url:
- info_response = self._download_webpage(
- 'http://vbox7.com/play/magare.do', video_id,
- 'Downloading info webpage',
- data=urlencode_postdata({'as3': '1', 'vid': video_id}),
- headers={'Content-Type': 'application/x-www-form-urlencoded'})
- final_url, thumbnail_url = map(
- lambda x: x.split('=')[1], info_response.split('&'))
+ title = video['title']
+ video_url = video['src']
if '/na.mp4' in video_url:
self.raise_geo_restricted()
- return {
+ uploader = video.get('uploader')
+
+ webpage = self._download_webpage(
+            'http://vbox7.com/play:%s' % video_id, video_id, fatal=False)
+
+ info = {}
+
+ if webpage:
+ info = self._search_json_ld(
+ webpage.replace('"/*@context"', '"@context"'), video_id,
+ fatal=False)
+
+ info.update({
'id': video_id,
- 'url': self._proto_relative_url(video_url, 'http:'),
'title': title,
- 'thumbnail': thumbnail_url,
- }
+ 'url': video_url,
+ 'uploader': uploader,
+ 'thumbnail': self._proto_relative_url(
+ info.get('thumbnail') or self._og_search_thumbnail(webpage),
+ 'http:'),
+ })
+ return info
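The `webpage.replace('"/*@context"', '"@context"')` call in the hunk above works around vbox7 pages emitting a mangled JSON-LD context key, which would otherwise make `_search_json_ld` fail to parse the block. A small illustration of the fix (the JSON snippet below is invented for illustration, not a real page excerpt):

```python
import json

# vbox7 prefixes the JSON-LD context key with a stray "/*"; patching the key
# back to "@context" makes the block valid JSON again, as the extractor does.
broken = '{"/*@context": "http://schema.org", "@type": "VideoObject", "name": "demo"}'
fixed = broken.replace('"/*@context"', '"@context"')
data = json.loads(fixed)
print(data['@context'])  # → http://schema.org
```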
'id': 'HDN7G5UMs',
'ext': 'mp4',
'title': 'Nvidia GeForce GTX Titan X - The Best Video Card on the Market?',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150317',
'description': 'Did Nvidia pull out all the stops on the Titan X, or does its performance leave something to be desired?',
'timestamp': int,
from .common import InfoExtractor
from ..compat import (
- compat_etree_fromstring,
compat_str,
compat_urlparse,
+ compat_HTTPError,
)
from ..utils import (
ExtractorError,
'url': 'http://www.vevo.com/watch/INS171400764',
'only_matching': True,
}]
- _SMIL_BASE_URL = 'http://smil.lvl3.vevo.com'
- _SOURCE_TYPES = {
- 0: 'youtube',
- 1: 'brightcove',
- 2: 'http',
- 3: 'hls_ios',
- 4: 'hls',
- 5: 'smil', # http
- 7: 'f4m_cc',
- 8: 'f4m_ak',
- 9: 'f4m_l3',
- 10: 'ism',
- 13: 'smil', # rtmp
- 18: 'dash',
- }
_VERSIONS = {
0: 'youtube', # only in AuthenticateVideo videoVersions
1: 'level3',
4: 'amazon',
}
- def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
- formats = []
- els = smil.findall('.//{http://www.w3.org/2001/SMIL20/Language}video')
- for el in els:
- src = el.attrib['src']
- m = re.match(r'''(?xi)
- (?P<ext>[a-z0-9]+):
- (?P<path>
- [/a-z0-9]+ # The directory and main part of the URL
- _(?P<tbr>[0-9]+)k
- _(?P<width>[0-9]+)x(?P<height>[0-9]+)
- _(?P<vcodec>[a-z0-9]+)
- _(?P<vbr>[0-9]+)
- _(?P<acodec>[a-z0-9]+)
- _(?P<abr>[0-9]+)
- \.[a-z0-9]+ # File extension
- )''', src)
- if not m:
- continue
-
- format_url = self._SMIL_BASE_URL + m.group('path')
- formats.append({
- 'url': format_url,
- 'format_id': 'smil_' + m.group('tbr'),
- 'vcodec': m.group('vcodec'),
- 'acodec': m.group('acodec'),
- 'tbr': int(m.group('tbr')),
- 'vbr': int(m.group('vbr')),
- 'abr': int(m.group('abr')),
- 'ext': m.group('ext'),
- 'width': int(m.group('width')),
- 'height': int(m.group('height')),
- })
- return formats
-
def _initialize_api(self, video_id):
req = sanitized_Request(
'http://www.vevo.com/auth', data=b'')
note='Retrieving oauth token',
errnote='Unable to retrieve oauth token')
- if 'THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION' in webpage:
+ if re.search(r'(?i)THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION', webpage):
self.raise_geo_restricted(
'%s said: This page is currently unavailable in your region' % self.IE_NAME)
self._api_url_template = self.http_scheme() + '//apiv2.vevo.com/%s?token=' + auth_info['access_token']
def _call_api(self, path, *args, **kwargs):
- return self._download_json(self._api_url_template % path, *args, **kwargs)
+ try:
+ data = self._download_json(self._api_url_template % path, *args, **kwargs)
+ except ExtractorError as e:
+ if isinstance(e.cause, compat_HTTPError):
+ errors = self._parse_json(e.cause.read().decode(), None)['errors']
+ error_message = ', '.join([error['message'] for error in errors])
+ raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message), expected=True)
+ raise
+ return data
def _real_extract(self, url):
video_id = self._match_id(url)
- json_url = 'http://api.vevo.com/VideoService/AuthenticateVideo?isrc=%s' % video_id
- response = self._download_json(
- json_url, video_id, 'Downloading video info',
- 'Unable to download info', fatal=False) or {}
- video_info = response.get('video') or {}
+ self._initialize_api(video_id)
+
+ video_info = self._call_api(
+ 'video/%s' % video_id, video_id, 'Downloading api video info',
+ 'Failed to download video info')
+
+ video_versions = self._call_api(
+ 'video/%s/streams' % video_id, video_id,
+ 'Downloading video versions info',
+ 'Failed to download video versions info',
+ fatal=False)
+
+ # Some videos are only available via webpage (e.g.
+ # https://github.com/rg3/youtube-dl/issues/9366)
+ if not video_versions:
+ webpage = self._download_webpage(url, video_id)
+ video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0]
+
+ uploader = None
artist = None
featured_artist = None
- uploader = None
- view_count = None
+        artists = video_info.get('artists') or []
+ for curr_artist in artists:
+ if curr_artist.get('role') == 'Featured':
+ featured_artist = curr_artist['name']
+ else:
+ artist = uploader = curr_artist['name']
+
formats = []
+ for video_version in video_versions:
+ version = self._VERSIONS.get(video_version['version'])
+ version_url = video_version.get('url')
+ if not version_url:
+ continue
- if not video_info:
- try:
- self._initialize_api(video_id)
- except ExtractorError:
- ytid = response.get('errorInfo', {}).get('ytid')
- if ytid:
- self.report_warning(
- 'Video is geoblocked, trying with the YouTube video %s' % ytid)
- return self.url_result(ytid, 'Youtube', ytid)
-
- raise
-
- video_info = self._call_api(
- 'video/%s' % video_id, video_id, 'Downloading api video info',
- 'Failed to download video info')
-
- video_versions = self._call_api(
- 'video/%s/streams' % video_id, video_id,
- 'Downloading video versions info',
- 'Failed to download video versions info',
- fatal=False)
-
- # Some videos are only available via webpage (e.g.
- # https://github.com/rg3/youtube-dl/issues/9366)
- if not video_versions:
- webpage = self._download_webpage(url, video_id)
- video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0]
-
- timestamp = parse_iso8601(video_info.get('releaseDate'))
- artists = video_info.get('artists')
- for curr_artist in artists:
- if curr_artist.get('role') == 'Featured':
- featured_artist = curr_artist['name']
- else:
- artist = uploader = curr_artist['name']
- view_count = int_or_none(video_info.get('views', {}).get('total'))
-
- for video_version in video_versions:
- version = self._VERSIONS.get(video_version['version'])
- version_url = video_version.get('url')
- if not version_url:
+ if '.ism' in version_url:
+ continue
+ elif '.mpd' in version_url:
+ formats.extend(self._extract_mpd_formats(
+ version_url, video_id, mpd_id='dash-%s' % version,
+ note='Downloading %s MPD information' % version,
+ errnote='Failed to download %s MPD information' % version,
+ fatal=False))
+ elif '.m3u8' in version_url:
+ formats.extend(self._extract_m3u8_formats(
+ version_url, video_id, 'mp4', 'm3u8_native',
+ m3u8_id='hls-%s' % version,
+ note='Downloading %s m3u8 information' % version,
+ errnote='Failed to download %s m3u8 information' % version,
+ fatal=False))
+ else:
+ m = re.search(r'''(?xi)
+ _(?P<width>[0-9]+)x(?P<height>[0-9]+)
+ _(?P<vcodec>[a-z0-9]+)
+ _(?P<vbr>[0-9]+)
+ _(?P<acodec>[a-z0-9]+)
+ _(?P<abr>[0-9]+)
+ \.(?P<ext>[a-z0-9]+)''', version_url)
+ if not m:
continue
- if '.ism' in version_url:
- continue
- elif '.mpd' in version_url:
- formats.extend(self._extract_mpd_formats(
- version_url, video_id, mpd_id='dash-%s' % version,
- note='Downloading %s MPD information' % version,
- errnote='Failed to download %s MPD information' % version,
- fatal=False))
- elif '.m3u8' in version_url:
- formats.extend(self._extract_m3u8_formats(
- version_url, video_id, 'mp4', 'm3u8_native',
- m3u8_id='hls-%s' % version,
- note='Downloading %s m3u8 information' % version,
- errnote='Failed to download %s m3u8 information' % version,
- fatal=False))
- else:
- m = re.search(r'''(?xi)
- _(?P<width>[0-9]+)x(?P<height>[0-9]+)
- _(?P<vcodec>[a-z0-9]+)
- _(?P<vbr>[0-9]+)
- _(?P<acodec>[a-z0-9]+)
- _(?P<abr>[0-9]+)
- \.(?P<ext>[a-z0-9]+)''', version_url)
- if not m:
- continue
-
- formats.append({
- 'url': version_url,
- 'format_id': 'http-%s-%s' % (version, video_version['quality']),
- 'vcodec': m.group('vcodec'),
- 'acodec': m.group('acodec'),
- 'vbr': int(m.group('vbr')),
- 'abr': int(m.group('abr')),
- 'ext': m.group('ext'),
- 'width': int(m.group('width')),
- 'height': int(m.group('height')),
- })
- else:
- timestamp = int_or_none(self._search_regex(
- r'/Date\((\d+)\)/',
- video_info['releaseDate'], 'release date', fatal=False),
- scale=1000)
- artists = video_info.get('mainArtists')
- if artists:
- artist = uploader = artists[0]['artistName']
-
- featured_artists = video_info.get('featuredArtists')
- if featured_artists:
- featured_artist = featured_artists[0]['artistName']
-
- smil_parsed = False
- for video_version in video_info['videoVersions']:
- version = self._VERSIONS.get(video_version['version'])
- if version == 'youtube':
- continue
- else:
- source_type = self._SOURCE_TYPES.get(video_version['sourceType'])
- renditions = compat_etree_fromstring(video_version['data'])
- if source_type == 'http':
- for rend in renditions.findall('rendition'):
- attr = rend.attrib
- formats.append({
- 'url': attr['url'],
- 'format_id': 'http-%s-%s' % (version, attr['name']),
- 'height': int_or_none(attr.get('frameheight')),
- 'width': int_or_none(attr.get('frameWidth')),
- 'tbr': int_or_none(attr.get('totalBitrate')),
- 'vbr': int_or_none(attr.get('videoBitrate')),
- 'abr': int_or_none(attr.get('audioBitrate')),
- 'vcodec': attr.get('videoCodec'),
- 'acodec': attr.get('audioCodec'),
- })
- elif source_type == 'hls':
- formats.extend(self._extract_m3u8_formats(
- renditions.find('rendition').attrib['url'], video_id,
- 'mp4', 'm3u8_native', m3u8_id='hls-%s' % version,
- note='Downloading %s m3u8 information' % version,
- errnote='Failed to download %s m3u8 information' % version,
- fatal=False))
- elif source_type == 'smil' and version == 'level3' and not smil_parsed:
- formats.extend(self._extract_smil_formats(
- renditions.find('rendition').attrib['url'], video_id, False))
- smil_parsed = True
+ formats.append({
+ 'url': version_url,
+ 'format_id': 'http-%s-%s' % (version, video_version['quality']),
+ 'vcodec': m.group('vcodec'),
+ 'acodec': m.group('acodec'),
+ 'vbr': int(m.group('vbr')),
+ 'abr': int(m.group('abr')),
+ 'ext': m.group('ext'),
+ 'width': int(m.group('width')),
+ 'height': int(m.group('height')),
+ })
self._sort_formats(formats)
track = video_info['title']
else:
age_limit = None
- duration = video_info.get('duration')
-
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnail': video_info.get('imageUrl') or video_info.get('thumbnailUrl'),
- 'timestamp': timestamp,
+ 'timestamp': parse_iso8601(video_info.get('releaseDate')),
'uploader': uploader,
- 'duration': duration,
- 'view_count': view_count,
+ 'duration': int_or_none(video_info.get('duration')),
+ 'view_count': int_or_none(video_info.get('views', {}).get('total')),
'age_limit': age_limit,
'track': track,
'artist': uploader,
'ext': 'mp4',
'title': 'Hevnen er søt: Episode 10 - Abu',
'description': 'md5:e25e4badb5f544b04341e14abdc72234',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 648.000,
'timestamp': 1404626400,
'upload_date': '20140706',
'ext': 'flv',
'title': 'OPPTAK: VGTV følger EM-kvalifiseringen',
'description': 'md5:3772d9c0dc2dff92a886b60039a7d4d3',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 9103.0,
'timestamp': 1410113864,
'upload_date': '20140907',
'ext': 'mp4',
'title': 'V75 fra Solvalla 30.05.15',
'description': 'md5:b3743425765355855f88e096acc93231',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 25966,
'timestamp': 1432975582,
'upload_date': '20150530',
format_info = {
'url': mp4_url,
}
- mobj = re.search('(\d+)_(\d+)_(\d+)', mp4_url)
+ mobj = re.search(r'(\d+)_(\d+)_(\d+)', mp4_url)
if mobj:
tbr = int(mobj.group(3))
format_info.update({
'ext': 'mp4',
'title': 'Alrekstad internat',
'description': 'md5:dc81a9056c874fedb62fc48a300dac58',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 191,
'timestamp': 1289991323,
'upload_date': '20101117',
'ext': 'mp4',
'title': 'Intro to VidBit',
'description': 'md5:5e0d6142eec00b766cbf114bfd3d16b7',
- 'thumbnail': 're:https?://.*\.jpg$',
+ 'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20160618',
'view_count': int,
'comment_count': int,
'timestamp': 1335371429,
'upload_date': '20120425',
'duration': 100.89,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'view_count': int,
'comment_count': int,
'categories': ['video content', 'high quality video', 'video made easy', 'how to produce video with limited resources', 'viddler'],
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+ int_or_none,
+ mimetype2ext,
+ parse_codecs,
+ xpath_element,
+ xpath_text,
+)
+
+
+class VideaIE(InfoExtractor):
+ _VALID_URL = r'''(?x)
+ https?://
+ videa\.hu/
+ (?:
+ videok/(?:[^/]+/)*[^?#&]+-|
+ player\?.*?\bv=|
+ player/v/
+ )
+ (?P<id>[^?#&]+)
+ '''
+ _TESTS = [{
+ 'url': 'http://videa.hu/videok/allatok/az-orult-kigyasz-285-kigyot-kigyo-8YfIAjxwWGwT8HVQ',
+ 'md5': '97a7af41faeaffd9f1fc864a7c7e7603',
+ 'info_dict': {
+ 'id': '8YfIAjxwWGwT8HVQ',
+ 'ext': 'mp4',
+ 'title': 'Az őrült kígyász 285 kígyót enged szabadon',
+ 'thumbnail': 'http://videa.hu/static/still/1.4.1.1007274.1204470.3',
+ 'duration': 21,
+ },
+ }, {
+ 'url': 'http://videa.hu/videok/origo/jarmuvek/supercars-elozes-jAHDWfWSJH5XuFhH',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://videa.hu/player?v=8YfIAjxwWGwT8HVQ',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://videa.hu/player/v/8YfIAjxwWGwT8HVQ?autoplay=1',
+ 'only_matching': True,
+ }]
+
+ @staticmethod
+ def _extract_urls(webpage):
+ return [url for _, url in re.findall(
+ r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//videa\.hu/player\?.*?\bv=.+?)\1',
+ webpage)]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ info = self._download_xml(
+ 'http://videa.hu/videaplayer_get_xml.php', video_id,
+ query={'v': video_id})
+
+ video = xpath_element(info, './/video', 'video', fatal=True)
+ sources = xpath_element(info, './/video_sources', 'sources', fatal=True)
+
+ title = xpath_text(video, './title', fatal=True)
+
+ formats = []
+ for source in sources.findall('./video_source'):
+ source_url = source.text
+ if not source_url:
+ continue
+ f = parse_codecs(source.get('codecs'))
+ f.update({
+ 'url': source_url,
+ 'ext': mimetype2ext(source.get('mimetype')) or 'mp4',
+ 'format_id': source.get('name'),
+ 'width': int_or_none(source.get('width')),
+ 'height': int_or_none(source.get('height')),
+ })
+ formats.append(f)
+ self._sort_formats(formats)
+
+ thumbnail = xpath_text(video, './poster_src')
+ duration = int_or_none(xpath_text(video, './duration'))
+
+ age_limit = None
+ is_adult = xpath_text(video, './is_adult_content', default=None)
+ if is_adult:
+ age_limit = 18 if is_adult == '1' else 0
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'thumbnail': thumbnail,
+ 'duration': duration,
+ 'age_limit': age_limit,
+ 'formats': formats,
+ }
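The three URL shapes accepted by `VideaIE._VALID_URL` above (video page, `player?v=`, and `player/v/`) can be sanity-checked with the stdlib `re` module alone. This is just an illustrative sketch reusing the pattern and the test URLs from the extractor:

```python
import re

# Copy of the VideaIE._VALID_URL verbose-mode pattern above.
VALID_URL = r'''(?x)
    https?://
        videa\.hu/
        (?:
            videok/(?:[^/]+/)*[^?#&]+-|
            player\?.*?\bv=|
            player/v/
        )
        (?P<id>[^?#&]+)
'''

urls = [
    'http://videa.hu/videok/allatok/az-orult-kigyasz-285-kigyot-kigyo-8YfIAjxwWGwT8HVQ',
    'http://videa.hu/player?v=8YfIAjxwWGwT8HVQ',
    'http://videa.hu/player/v/8YfIAjxwWGwT8HVQ?autoplay=1',
]
# All three variants resolve to the same video id.
ids = [re.match(VALID_URL, url).group('id') for url in urls]
print(ids)
```

Note how the slug alternative ends in `-` so the trailing hash is captured as the id, while `[^?#&]+` stops the id at any query string or fragment.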
'id': 'AOSQBJYKIDDIKYJBQSOA',
'ext': 'mp4',
'title': '1254207',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://videomega.tv/cdn.php?ref=AOSQBJYKIDDIKYJBQSOA&width=1070&height=600',
'title': 'Кино в деталях 5 сезон В гостях Алексей Чумаков и Юлия Ковальчук',
'series': 'Кино в деталях',
'episode': 'В гостях Алексей Чумаков и Юлия Ковальчук',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 2910,
'view_count': int,
'comment_count': int,
'title': 'Молодежка 2 сезон 40 серия',
'series': 'Молодежка',
'episode': '40 серия',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 2809,
'view_count': int,
'comment_count': int,
'ext': 'flv',
'title': 'Промо Команда проиграла из-за Бакина?',
'episode': 'Команда проиграла из-за Бакина?',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 29,
'age_limit': 16,
'view_count': int,
'ext': 'flv',
'title': 'Ёлки 3',
'description': '',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 5579,
'age_limit': 6,
'view_count': int,
'ext': 'flv',
'title': '1 серия. Здравствуй, Аквавилль!',
'description': 'md5:c6003179538b5d353e7bcd5b1372b2d7',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 754,
'age_limit': 6,
'view_count': int,
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import random
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+ determine_ext,
+ float_or_none,
+ parse_age_limit,
+ qualities,
+ try_get,
+ unified_timestamp,
+ urljoin,
+)
+
+
+class VideoPressIE(InfoExtractor):
+ _VALID_URL = r'https?://videopress\.com/embed/(?P<id>[\da-zA-Z]+)'
+ _TESTS = [{
+ 'url': 'https://videopress.com/embed/kUJmAcSf',
+ 'md5': '706956a6c875873d51010921310e4bc6',
+ 'info_dict': {
+ 'id': 'kUJmAcSf',
+ 'ext': 'mp4',
+ 'title': 'VideoPress Demo',
+ 'thumbnail': r're:^https?://.*\.jpg',
+ 'duration': 634.6,
+ 'timestamp': 1434983935,
+ 'upload_date': '20150622',
+ 'age_limit': 0,
+ },
+ }, {
+ # 17+, requires birth_* params
+ 'url': 'https://videopress.com/embed/iH3gstfZ',
+ 'only_matching': True,
+ }]
+
+ @staticmethod
+ def _extract_urls(webpage):
+ return re.findall(
+ r'<iframe[^>]+src=["\']((?:https?://)?videopress\.com/embed/[\da-zA-Z]+)',
+ webpage)
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ video = self._download_json(
+ 'https://public-api.wordpress.com/rest/v1.1/videos/%s' % video_id,
+ video_id, query={
+ 'birth_month': random.randint(1, 12),
+ 'birth_day': random.randint(1, 31),
+ 'birth_year': random.randint(1950, 1995),
+ })
+
+ title = video['title']
+
+ def base_url(scheme):
+ return try_get(
+ video, lambda x: x['file_url_base'][scheme], compat_str)
+
+ base_url = base_url('https') or base_url('http')
+
+ QUALITIES = ('std', 'dvd', 'hd')
+ quality = qualities(QUALITIES)
+
+ formats = []
+ for format_id, f in video['files'].items():
+ if not isinstance(f, dict):
+ continue
+ for ext, path in f.items():
+ if ext in ('mp4', 'ogg'):
+ formats.append({
+ 'url': urljoin(base_url, path),
+ 'format_id': '%s-%s' % (format_id, ext),
+ 'ext': determine_ext(path, ext),
+ 'quality': quality(format_id),
+ })
+ original_url = try_get(video, lambda x: x['original'], compat_str)
+ if original_url:
+ formats.append({
+ 'url': original_url,
+ 'format_id': 'original',
+ 'quality': len(QUALITIES),
+ })
+ self._sort_formats(formats)
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': video.get('description'),
+ 'thumbnail': video.get('poster'),
+ 'duration': float_or_none(video.get('duration'), 1000),
+ 'timestamp': unified_timestamp(video.get('upload_date')),
+ 'age_limit': parse_age_limit(video.get('rating')),
+ 'formats': formats,
+ }
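The `qualities()` helper imported from `..utils` turns the ordered tuple `('std', 'dvd', 'hd')` into a ranking function used for format sorting. A minimal sketch of that behavior (a re-implementation for illustration, not necessarily the library code verbatim):

```python
def qualities(quality_ids):
    # Rank a format_id by its index in the preference tuple;
    # unknown ids sort lowest (-1).
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q

QUALITIES = ('std', 'dvd', 'hd')
quality = qualities(QUALITIES)
print(quality('std'), quality('hd'), quality('4k'))
```

This is also why the `original` format above is assigned `quality: len(QUALITIES)`: it sorts strictly above every named tier.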
+++ /dev/null
-from __future__ import unicode_literals
-
-import re
-import base64
-
-from .common import InfoExtractor
-from ..utils import (
- unified_strdate,
- int_or_none,
-)
-
-
-class VideoTtIE(InfoExtractor):
- _WORKING = False
- ID_NAME = 'video.tt'
- IE_DESC = 'video.tt - Your True Tube'
- _VALID_URL = r'https?://(?:www\.)?video\.tt/(?:(?:video|embed)/|watch_video\.php\?v=)(?P<id>[\da-zA-Z]{9})'
-
- _TESTS = [{
- 'url': 'http://www.video.tt/watch_video.php?v=amd5YujV8',
- 'md5': 'b13aa9e2f267effb5d1094443dff65ba',
- 'info_dict': {
- 'id': 'amd5YujV8',
- 'ext': 'flv',
- 'title': 'Motivational video Change your mind in just 2.50 mins',
- 'description': '',
- 'upload_date': '20130827',
- 'uploader': 'joseph313',
- }
- }, {
- 'url': 'http://video.tt/embed/amd5YujV8',
- 'only_matching': True,
- }]
-
- def _real_extract(self, url):
- mobj = re.match(self._VALID_URL, url)
- video_id = mobj.group('id')
-
- settings = self._download_json(
- 'http://www.video.tt/player_control/settings.php?v=%s' % video_id, video_id,
- 'Downloading video JSON')['settings']
-
- video = settings['video_details']['video']
-
- formats = [
- {
- 'url': base64.b64decode(res['u'].encode('utf-8')).decode('utf-8'),
- 'ext': 'flv',
- 'format_id': res['l'],
- } for res in settings['res'] if res['u']
- ]
-
- return {
- 'id': video_id,
- 'title': video['title'],
- 'description': video['description'],
- 'thumbnail': settings['config']['thumbnail'],
- 'upload_date': unified_strdate(video['added']),
- 'uploader': video['owner'],
- 'view_count': int_or_none(video['view_count']),
- 'comment_count': None if video.get('comment_count') == '--' else int_or_none(video['comment_count']),
- 'like_count': int_or_none(video['liked']),
- 'dislike_count': int_or_none(video['disliked']),
- 'formats': formats,
- }
'ext': 'mp4',
'title': 'DJ_AMBRED - Booyah (Live 2015)',
'description': 'md5:27dc15f819b6a78a626490881adbadf8',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 149,
'like_count': int,
},
'ext': 'mp4',
'title': 'Fishing for piranha - the easy way',
'description': 'source: https://www.facebook.com/photo.php?v=312276045600871',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1406313244,
'upload_date': '20140725',
'age_limit': 0,
'id': 'Gc6M',
'ext': 'mp4',
'title': 'O Mere Dil ke chain - Arnav and Khushi VM',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1441211642,
'upload_date': '20150902',
'uploader': 'SunshineM',
'ext': 'mp4',
'title': 'The Carver',
'description': 'md5:e9c24870018ae8113be936645b93ba3c',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1433203629,
'upload_date': '20150602',
'uploader': 'Thomas',
'id': 'Wmur',
'ext': 'mp4',
'title': 'naked smoking & stretching',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1430931613,
'upload_date': '20150506',
'uploader': 'naked-yogi',
'id': 'e5g',
'ext': 'mp4',
'title': 'Video upload (e5g)',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1401480195,
'upload_date': '20140530',
'uploader': None,
class ViewLiftBaseIE(InfoExtractor):
- _DOMAINS_REGEX = '(?:snagfilms|snagxtreme|funnyforfree|kiddovid|winnersview|monumentalsportsnetwork|vayafilm)\.com|kesari\.tv'
+ _DOMAINS_REGEX = r'(?:snagfilms|snagxtreme|funnyforfree|kiddovid|winnersview|monumentalsportsnetwork|vayafilm)\.com|kesari\.tv'
class ViewLiftEmbedIE(ViewLiftBaseIE):
'ext': 'mp4',
'title': 'Lost for Life',
'description': 'md5:fbdacc8bb6b455e464aaf98bc02e1c82',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 4489,
'categories': ['Documentary', 'Crime', 'Award Winning', 'Festivals']
}
'ext': 'mp4',
'title': 'India',
'description': 'md5:5c168c5a8f4719c146aad2e0dfac6f5f',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 979,
'categories': ['Documentary', 'Sports', 'Politics']
}
snag = self._parse_json(
self._search_regex(
- 'Snag\.page\.data\s*=\s*(\[.+?\]);', webpage, 'snag'),
+ r'Snag\.page\.data\s*=\s*(\[.+?\]);', webpage, 'snag'),
display_id)
for item in snag:
formats.extend(m3u8_formats)
else:
qualities_basename = self._search_regex(
- '/([^/]+)\.csmil/',
+ r'/([^/]+)\.csmil/',
manifest_url, 'qualities basename', default=None)
if not qualities_basename:
continue
'ext': 'mp4',
'title': 'Automatics, robotics and biocybernetics',
'description': 'md5:815fc1deb6b3a2bff99de2d5325be482',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'timestamp': 1372349289,
'upload_date': '20130627',
'duration': 565,
'ext': 'flv',
'title': 'NLP at Google',
'description': 'md5:fc7a6d9bf0302d7cc0e53f7ca23747b3',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'timestamp': 1284375600,
'upload_date': '20100913',
'duration': 5352,
'id': '23181',
'title': 'Deep Learning Summer School, Montreal 2015',
'description': 'md5:0533a85e4bd918df52a01f0e1ebe87b7',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'timestamp': 1438560000,
},
'playlist_count': 30,
'id': '9737',
'display_id': 'mlss09uk_bishop_ibi',
'title': 'Introduction To Bayesian Inference',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'timestamp': 1251622800,
},
'playlist': [{
'display_id': 'mlss09uk_bishop_ibi_part1',
'ext': 'wmv',
'title': 'Introduction To Bayesian Inference (Part 1)',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'duration': 4622,
'timestamp': 1251622800,
'upload_date': '20090830',
'display_id': 'mlss09uk_bishop_ibi_part2',
'ext': 'wmv',
'title': 'Introduction To Bayesian Inference (Part 2)',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
'duration': 5641,
'timestamp': 1251622800,
'upload_date': '20090830',
sanitized_Request,
smuggle_url,
std_headers,
- unified_strdate,
+ try_get,
+ unified_timestamp,
unsmuggle_url,
urlencode_postdata,
unescapeHTML,
parse_filesize,
- try_get,
)
def _vimeo_sort_formats(self, formats):
# Bitrates are completely broken. Single m3u8 may contain entries in kbps and bps
# at the same time without actual units specified. This leads to wrong sorting.
- self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'format_id'))
+ self._sort_formats(formats, field_preference=('preference', 'height', 'width', 'fps', 'tbr', 'format_id'))
def _parse_config(self, config, video_id):
+ video_data = config['video']
# Extract title
- video_title = config['video']['title']
+ video_title = video_data['title']
# Extract uploader, uploader_url and uploader_id
- video_uploader = config['video'].get('owner', {}).get('name')
- video_uploader_url = config['video'].get('owner', {}).get('url')
+ video_uploader = video_data.get('owner', {}).get('name')
+ video_uploader_url = video_data.get('owner', {}).get('url')
video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
# Extract video thumbnail
- video_thumbnail = config['video'].get('thumbnail')
+ video_thumbnail = video_data.get('thumbnail')
if video_thumbnail is None:
- video_thumbs = config['video'].get('thumbs')
+ video_thumbs = video_data.get('thumbs')
if video_thumbs and isinstance(video_thumbs, dict):
_, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
# Extract video duration
- video_duration = int_or_none(config['video'].get('duration'))
+ video_duration = int_or_none(video_data.get('duration'))
formats = []
- config_files = config['video'].get('files') or config['request'].get('files', {})
+ config_files = video_data.get('files') or config['request'].get('files', {})
for f in config_files.get('progressive', []):
video_url = f.get('url')
if not video_url:
'fps': int_or_none(f.get('fps')),
'tbr': int_or_none(f.get('bitrate')),
})
- m3u8_url = config_files.get('hls', {}).get('url')
- if m3u8_url:
- formats.extend(self._extract_m3u8_formats(
- m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+
+ for files_type in ('hls', 'dash'):
+ for cdn_name, cdn_data in config_files.get(files_type, {}).get('cdns', {}).items():
+ manifest_url = cdn_data.get('url')
+ if not manifest_url:
+ continue
+ format_id = '%s-%s' % (files_type, cdn_name)
+ if files_type == 'hls':
+ formats.extend(self._extract_m3u8_formats(
+ manifest_url, video_id, 'mp4',
+ 'm3u8_native', m3u8_id=format_id,
+ note='Downloading %s m3u8 information' % cdn_name,
+ fatal=False))
+ elif files_type == 'dash':
+ mpd_pattern = r'/%s/(?:sep/)?video/' % video_id
+ mpd_manifest_urls = []
+ if re.search(mpd_pattern, manifest_url):
+ for suffix, repl in (('', 'video'), ('_sep', 'sep/video')):
+ mpd_manifest_urls.append((format_id + suffix, re.sub(
+ mpd_pattern, '/%s/%s/' % (video_id, repl), manifest_url)))
+ else:
+ mpd_manifest_urls = [(format_id, manifest_url)]
+ for f_id, m_url in mpd_manifest_urls:
+ formats.extend(self._extract_mpd_formats(
+ m_url.replace('/master.json', '/master.mpd'), video_id, f_id,
+ 'Downloading %s MPD information' % cdn_name,
+ fatal=False))
subtitles = {}
text_tracks = config['request'].get('text_tracks')
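The DASH branch above probes both the muxed (`video`) and separated (`sep/video`) manifest layouts and swaps Vimeo's JSON manifest name for a real MPD. A standalone sketch of that URL rewriting, using a made-up CDN hostname and clip id for illustration:

```python
import re

video_id = '123456789'  # hypothetical clip id
manifest_url = 'https://example.cdn/%s/sep/video/master.json?base64_init=1' % video_id
format_id = 'dash-akfire'  # hypothetical files_type-cdn_name pair

mpd_pattern = r'/%s/(?:sep/)?video/' % video_id
mpd_manifest_urls = []
if re.search(mpd_pattern, manifest_url):
    # Try both layouts regardless of which one the config URL used.
    for suffix, repl in (('', 'video'), ('_sep', 'sep/video')):
        mpd_manifest_urls.append((format_id + suffix, re.sub(
            mpd_pattern, '/%s/%s/' % (video_id, repl), manifest_url)))
else:
    mpd_manifest_urls = [(format_id, manifest_url)]

mpd_urls = [m_url.replace('/master.json', '/master.mpd')
            for _, m_url in mpd_manifest_urls]
print(mpd_urls)
```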
'ext': 'mp4',
'title': "youtube-dl test video - \u2605 \" ' \u5e78 / \\ \u00e4 \u21ad \U0001d550",
'description': 'md5:2d3305bad981a06ff79f027f19865021',
+ 'timestamp': 1355990239,
'upload_date': '20121220',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/user7108434',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/user7108434',
'uploader_id': 'user7108434',
'uploader': 'Filippo Valsorda',
'duration': 10,
+ 'license': 'by-sa',
},
},
{
'info_dict': {
'id': '68093876',
'ext': 'mp4',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/openstreetmapus',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/openstreetmapus',
'uploader_id': 'openstreetmapus',
'uploader': 'OpenStreetMap US',
'title': 'Andy Allan - Putting the Carto into OpenStreetMap Cartography',
'ext': 'mp4',
'title': 'Kathy Sierra: Building the minimum Badass User, Business of Software 2012',
'uploader': 'The BLN & Business of Software',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/theblnbusinessofsoftware',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/theblnbusinessofsoftware',
'uploader_id': 'theblnbusinessofsoftware',
'duration': 3610,
'description': None,
'id': '68375962',
'ext': 'mp4',
'title': 'youtube-dl password protected test video',
+ 'timestamp': 1371200155,
'upload_date': '20130614',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/user18948128',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/user18948128',
'uploader_id': 'user18948128',
'uploader': 'Jaime Marquínez Ferrándiz',
'duration': 10,
- 'description': 'This is "youtube-dl password protected test video" by on Vimeo, the home for high quality videos and the people who love them.',
+ 'description': 'md5:dca3ea23adb29ee387127bc4ddfce63f',
},
'params': {
'videopassword': 'youtube-dl',
'ext': 'mp4',
'title': 'Key & Peele: Terrorist Interrogation',
'description': 'md5:8678b246399b070816b12313e8b4eb5c',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/atencio',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/atencio',
'uploader_id': 'atencio',
'uploader': 'Peter Atencio',
- 'upload_date': '20130927',
+ 'timestamp': 1380339469,
+ 'upload_date': '20130928',
'duration': 187,
},
},
'ext': 'mp4',
'title': 'The New Vimeo Player (You Know, For Videos)',
'description': 'md5:2ec900bf97c3f389378a96aee11260ea',
+ 'timestamp': 1381846109,
'upload_date': '20131015',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/staff',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/staff',
'uploader_id': 'staff',
'uploader': 'Vimeo Staff',
'duration': 62,
'ext': 'mp4',
'title': 'Pier Solar OUYA Official Trailer',
'uploader': 'Tulio Gonçalves',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/user28849593',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/user28849593',
'uploader_id': 'user28849593',
},
},
{
# contains original format
'url': 'https://vimeo.com/33951933',
- 'md5': '2d9f5475e0537f013d0073e812ab89e6',
+ 'md5': '53c688fa95a55bf4b7293d37a89c5c53',
'info_dict': {
'id': '33951933',
'ext': 'mp4',
'title': 'FOX CLASSICS - Forever Classic ID - A Full Minute',
'uploader': 'The DMCI',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/dmci',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/dmci',
'uploader_id': 'dmci',
+ 'timestamp': 1324343742,
'upload_date': '20111220',
'description': 'md5:ae23671e82d05415868f7ad1aec21147',
},
'url': 'https://vimeo.com/channels/tributes/6213729',
'info_dict': {
'id': '6213729',
- 'ext': 'mp4',
+ 'ext': 'mov',
'title': 'Vimeo Tribute: The Shining',
'uploader': 'Casey Donahue',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/caseydonahue',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/caseydonahue',
'uploader_id': 'caseydonahue',
+ 'timestamp': 1250886430,
'upload_date': '20090821',
'description': 'md5:bdbf314014e58713e6e5b66eb252f4a6',
},
'expected_warnings': ['Unable to download JSON metadata'],
},
{
- # redirects to ondemand extractor and should be passed throught it
+ # redirects to ondemand extractor and should be passed through it
# for successful extraction
'url': 'https://vimeo.com/73445910',
'info_dict': {
'ext': 'mp4',
'title': 'The Reluctant Revolutionary',
'uploader': '10Ft Films',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/tenfootfilms',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/tenfootfilms',
'uploader_id': 'tenfootfilms',
},
'params': {
'%s said: %s' % (self.IE_NAME, seed_status['title']),
expected=True)
+ cc_license = None
+ timestamp = None
+
# Extract the config JSON
try:
try:
vimeo_clip_page_config = self._search_regex(
r'vimeo\.clip_page_config\s*=\s*({.+?});', webpage,
'vimeo clip page config')
- config_url = self._parse_json(
- vimeo_clip_page_config, video_id)['player']['config_url']
+ page_config = self._parse_json(vimeo_clip_page_config, video_id)
+ config_url = page_config['player']['config_url']
+ cc_license = page_config.get('cc_license')
+ timestamp = try_get(
+ page_config, lambda x: x['clip']['uploaded_on'],
+ compat_str)
config_json = self._download_webpage(config_url, video_id)
config = json.loads(config_json)
except RegexNotFoundError:
# For pro videos or player.vimeo.com urls
# We try to find out which variable the config dict is assigned to
- m_variable_name = re.search('(\w)\.video\.id', webpage)
+ m_variable_name = re.search(r'(\w)\.video\.id', webpage)
if m_variable_name is not None:
config_re = r'%s=({[^}].+?});' % re.escape(m_variable_name.group(1))
else:
self._downloader.report_warning('Cannot find video description')
# Extract upload date
- video_upload_date = None
- mobj = re.search(r'<time[^>]+datetime="([^"]+)"', webpage)
- if mobj is not None:
- video_upload_date = unified_strdate(mobj.group(1))
+ if not timestamp:
+ timestamp = self._search_regex(
+ r'<time[^>]+datetime="([^"]+)"', webpage,
+ 'timestamp', default=None)
try:
view_count = int(self._search_regex(r'UserPlays:(\d+)', webpage, 'view count'))
info_dict = self._parse_config(config, video_id)
formats.extend(info_dict['formats'])
self._vimeo_sort_formats(formats)
+
+ if not cc_license:
+ cc_license = self._search_regex(
+ r'<link[^>]+rel=["\']license["\'][^>]+href=(["\'])(?P<license>(?:(?!\1).)+)\1',
+ webpage, 'license', default=None, group='license')
+
info_dict.update({
'id': video_id,
'formats': formats,
- 'upload_date': video_upload_date,
+ 'timestamp': unified_timestamp(timestamp),
'description': video_description,
'webpage_url': url,
'view_count': view_count,
'like_count': like_count,
'comment_count': comment_count,
+ 'license': cc_license,
})
return info_dict
'ext': 'mp4',
'title': 'המעבדה - במאי יותם פלדמן',
'uploader': 'גם סרטים',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/gumfilms',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/gumfilms',
'uploader_id': 'gumfilms',
},
+ 'params': {
+ 'format': 'best[protocol=https]',
+ },
}, {
# requires Referer to be passed along with og:video:url
'url': 'https://vimeo.com/ondemand/36938/126682985',
'ext': 'mp4',
'title': 'Rävlock, rätt läte på rätt plats',
'uploader': 'Lindroth & Norin',
- 'uploader_url': 're:https?://(?:www\.)?vimeo\.com/user14430847',
+ 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/user14430847',
'uploader_id': 'user14430847',
},
'params': {
# Try extracting href first since not all videos are available via
# short https://vimeo.com/id URL (e.g. https://vimeo.com/channels/tributes/6213729)
clips = re.findall(
- r'id="clip_(\d+)"[^>]*>\s*<a[^>]+href="(/(?:[^/]+/)*\1)', webpage)
+ r'id="clip_(\d+)"[^>]*>\s*<a[^>]+href="(/(?:[^/]+/)*\1)(?:[^>]+\btitle="([^"]+)")?', webpage)
if clips:
- for video_id, video_url in clips:
+ for video_id, video_url, video_title in clips:
yield self.url_result(
compat_urlparse.urljoin(base_url, video_url),
- VimeoIE.ie_key(), video_id=video_id)
+ VimeoIE.ie_key(), video_id=video_id, video_title=video_title)
# More relaxed fallback
else:
for video_id in re.findall(r'id=["\']clip_(\d+)', webpage):
'title': 're:(?i)^Death by dogma versus assembling agile . Sander Hoogendoorn',
'uploader': 'DevWeek Events',
'duration': 2773,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader_id': 'user22258446',
}
}, {
def _get_config_url(self, webpage_url, video_id, video_password_verified=False):
webpage = self._download_webpage(webpage_url, video_id)
- data = self._parse_json(self._search_regex(
- r'window\s*=\s*_extend\(window,\s*({.+?})\);', webpage, 'data',
- default=NO_DEFAULT if video_password_verified else '{}'), video_id)
- config_url = data.get('vimeo_esi', {}).get('config', {}).get('configUrl')
+ config_url = self._html_search_regex(
+ r'data-config-url=(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
+ 'config URL', default=None, group='url')
+ if not config_url:
+ data = self._parse_json(self._search_regex(
+ r'window\s*=\s*_extend\(window,\s*({.+?})\);', webpage, 'data',
+ default=NO_DEFAULT if video_password_verified else '{}'), video_id)
+ config_url = data.get('vimeo_esi', {}).get('config', {}).get('configUrl')
if config_url is None:
self._verify_video_password(webpage_url, video_id, webpage)
config_url = self._get_config_url(
'ext': 'mp4',
'title': 'Sunset',
'duration': 20,
- 'thumbnail': 're:https?://.*?\.jpg',
+ 'thumbnail': r're:https?://.*?\.jpg',
},
}, {
'url': 'http://player.vimple.ru/iframe/52e1beec-1314-4a83-aeac-c61562eadbf9',
from .common import InfoExtractor
from ..utils import (
+ determine_ext,
int_or_none,
- unified_strdate,
+ unified_timestamp,
)
'id': 'b9KOOWX7HUx',
'ext': 'mp4',
'title': 'Chicken.',
- 'alt_title': 'Vine by Jack Dorsey',
+ 'alt_title': 'Vine by Jack',
+ 'timestamp': 1368997951,
'upload_date': '20130519',
- 'uploader': 'Jack Dorsey',
+ 'uploader': 'Jack',
'uploader_id': '76',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
},
- }, {
- 'url': 'https://vine.co/v/MYxVapFvz2z',
- 'md5': '7b9a7cbc76734424ff942eb52c8f1065',
- 'info_dict': {
- 'id': 'MYxVapFvz2z',
- 'ext': 'mp4',
- 'title': 'Fuck Da Police #Mikebrown #justice #ferguson #prayforferguson #protesting #NMOS14',
- 'alt_title': 'Vine by Mars Ruiz',
- 'upload_date': '20140815',
- 'uploader': 'Mars Ruiz',
- 'uploader_id': '1102363502380728320',
- 'view_count': int,
- 'like_count': int,
- 'comment_count': int,
- 'repost_count': int,
- },
- }, {
- 'url': 'https://vine.co/v/bxVjBbZlPUH',
- 'md5': 'ea27decea3fa670625aac92771a96b73',
- 'info_dict': {
- 'id': 'bxVjBbZlPUH',
- 'ext': 'mp4',
- 'title': '#mw3 #ac130 #killcam #angelofdeath',
- 'alt_title': 'Vine by Z3k3',
- 'upload_date': '20130430',
- 'uploader': 'Z3k3',
- 'uploader_id': '936470460173008896',
- 'view_count': int,
- 'like_count': int,
- 'comment_count': int,
- 'repost_count': int,
- },
- }, {
- 'url': 'https://vine.co/oembed/MYxVapFvz2z.json',
- 'only_matching': True,
}, {
'url': 'https://vine.co/v/e192BnZnZ9V',
'info_dict': {
'ext': 'mp4',
'title': 'ยิ้ม~ เขิน~ อาย~ น่าร้ากอ้ะ >//< @n_whitewo @orlameena #lovesicktheseries #lovesickseason2',
'alt_title': 'Vine by Pimry_zaa',
+ 'timestamp': 1436057405,
'upload_date': '20150705',
'uploader': 'Pimry_zaa',
'uploader_id': '1135760698325307392',
'params': {
'skip_download': True,
},
+ }, {
+ 'url': 'https://vine.co/v/MYxVapFvz2z',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://vine.co/v/bxVjBbZlPUH',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://vine.co/oembed/MYxVapFvz2z.json',
+ 'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
- webpage = self._download_webpage('https://vine.co/v/' + video_id, video_id)
-
- data = self._parse_json(
- self._search_regex(
- r'window\.POST_DATA\s*=\s*({.+?});\s*</script>',
- webpage, 'vine data'),
- video_id)
-
- data = data[list(data.keys())[0]]
-
- formats = [{
- 'format_id': '%(format)s-%(rate)s' % f,
- 'vcodec': f.get('format'),
- 'quality': f.get('rate'),
- 'url': f['videoUrl'],
- } for f in data['videoUrls'] if f.get('videoUrl')]
+ data = self._download_json(
+ 'https://archive.vine.co/posts/%s.json' % video_id, video_id)
+
+ def video_url(kind):
+ for url_suffix in ('Url', 'URL'):
+ format_url = data.get('video%s%s' % (kind, url_suffix))
+ if format_url:
+ return format_url
+
+ formats = []
+ for quality, format_id in enumerate(('low', '', 'dash')):
+ format_url = video_url(format_id.capitalize())
+ if not format_url:
+ continue
+ # DASH link returns plain mp4
+ if format_id == 'dash' and determine_ext(format_url) == 'mpd':
+ formats.extend(self._extract_mpd_formats(
+ format_url, video_id, mpd_id='dash', fatal=False))
+ else:
+ formats.append({
+ 'url': format_url,
+ 'format_id': format_id or 'standard',
+ 'quality': quality,
+ })
self._sort_formats(formats)
username = data.get('username')
return {
'id': video_id,
- 'title': data.get('description') or self._og_search_title(webpage),
- 'alt_title': 'Vine by %s' % username if username else self._og_search_description(webpage, default=None),
+ 'title': data.get('description'),
+ 'alt_title': 'Vine by %s' % username if username else None,
'thumbnail': data.get('thumbnailUrl'),
- 'upload_date': unified_strdate(data.get('created')),
+ 'timestamp': unified_timestamp(data.get('created')),
'uploader': username,
'uploader_id': data.get('userIdStr'),
- 'view_count': int_or_none(data.get('loops', {}).get('count')),
- 'like_count': int_or_none(data.get('likes', {}).get('count')),
- 'comment_count': int_or_none(data.get('comments', {}).get('count')),
- 'repost_count': int_or_none(data.get('reposts', {}).get('count')),
+ 'view_count': int_or_none(data.get('loops')),
+ 'like_count': int_or_none(data.get('likes')),
+ 'comment_count': int_or_none(data.get('comments')),
+ 'repost_count': int_or_none(data.get('reposts')),
'formats': formats,
}
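The new Vine extraction enumerates the `('low', '', 'dash')` tiers and probes both the `Url` and `URL` key casings in the archive payload. A self-contained sketch of that selection logic, run against a hypothetical payload (the field names mirror the extractor; the URLs are invented):

```python
# Hypothetical archive payload; key casing varies between posts,
# which is why both the 'Url' and 'URL' suffixes are probed.
data = {
    'videoLowURL': 'https://v.example/low.mp4',
    'videoUrl': 'https://v.example/standard.mp4',
}

def video_url(kind):
    for url_suffix in ('Url', 'URL'):
        format_url = data.get('video%s%s' % (kind, url_suffix))
        if format_url:
            return format_url

formats = []
for quality, format_id in enumerate(('low', '', 'dash')):
    format_url = video_url(format_id.capitalize())
    if not format_url:
        continue  # this tier is absent from the payload
    formats.append({
        'url': format_url,
        'format_id': format_id or 'standard',
        'quality': quality,
    })
print([f['format_id'] for f in formats])
```

The empty string in the tier tuple is what makes the plain `videoUrl` key map to the `standard` format id.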
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+ ExtractorError,
+ int_or_none,
+)
+
+
+class ViuBaseIE(InfoExtractor):
+ def _real_initialize(self):
+ viu_auth_res = self._request_webpage(
+ 'https://www.viu.com/api/apps/v2/authenticate', None,
+ 'Requesting Viu auth', query={
+ 'acct': 'test',
+ 'appid': 'viu_desktop',
+ 'fmt': 'json',
+ 'iid': 'guest',
+ 'languageid': 'default',
+ 'platform': 'desktop',
+ 'userid': 'guest',
+ 'useridtype': 'guest',
+ 'ver': '1.0'
+ }, headers=self.geo_verification_headers())
+ self._auth_token = viu_auth_res.info()['X-VIU-AUTH']
+
+ def _call_api(self, path, *args, **kwargs):
+ headers = self.geo_verification_headers()
+ headers.update({
+ 'X-VIU-AUTH': self._auth_token
+ })
+ headers.update(kwargs.get('headers', {}))
+ kwargs['headers'] = headers
+ response = self._download_json(
+ 'https://www.viu.com/api/' + path, *args, **kwargs)['response']
+ if response.get('status') != 'success':
+ raise ExtractorError('%s said: %s' % (
+ self.IE_NAME, response['message']), expected=True)
+ return response
+
+
+class ViuIE(ViuBaseIE):
+ _VALID_URL = r'(?:viu:|https?://www\.viu\.com/[a-z]{2}/media/)(?P<id>\d+)'
+ _TESTS = [{
+ 'url': 'https://www.viu.com/en/media/1116705532?containerId=playlist-22168059',
+ 'info_dict': {
+ 'id': '1116705532',
+ 'ext': 'mp4',
+ 'title': 'Citizen Khan - Ep 1',
+ 'description': 'md5:d7ea1604f49e5ba79c212c551ce2110e',
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download',
+ },
+ 'skip': 'Geo-restricted to India',
+ }, {
+ 'url': 'https://www.viu.com/en/media/1130599965',
+ 'info_dict': {
+ 'id': '1130599965',
+ 'ext': 'mp4',
+ 'title': 'Jealousy Incarnate - Episode 1',
+ 'description': 'md5:d3d82375cab969415d2720b6894361e9',
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download',
+ },
+ 'skip': 'Geo-restricted to Indonesia',
+ }]
+
+ def _real_extract(self, url):
+ video_id = self._match_id(url)
+
+ video_data = self._call_api(
+ 'clip/load', video_id, 'Downloading video data', query={
+ 'appid': 'viu_desktop',
+ 'fmt': 'json',
+ 'id': video_id
+ })['item'][0]
+
+ title = video_data['title']
+
+ m3u8_url = None
+ url_path = video_data.get('urlpathd') or video_data.get('urlpath')
+ tdirforwhole = video_data.get('tdirforwhole')
+ # #EXT-X-BYTERANGE is not supported by native hls downloader
+ # and ffmpeg (#10955)
+ # hls_file = video_data.get('hlsfile')
+ hls_file = video_data.get('jwhlsfile')
+ if url_path and tdirforwhole and hls_file:
+ m3u8_url = '%s/%s/%s' % (url_path, tdirforwhole, hls_file)
+ else:
+ # m3u8_url = re.sub(
+ # r'(/hlsc_)[a-z]+(\d+\.m3u8)',
+ # r'\1whe\2', video_data['href'])
+ m3u8_url = video_data['href']
+ formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
+ self._sort_formats(formats)
+
+ subtitles = {}
+ for key, value in video_data.items():
+ mobj = re.match(r'^subtitle_(?P<lang>[^_]+)_(?P<ext>(vtt|srt))', key)
+ if not mobj:
+ continue
+ subtitles.setdefault(mobj.group('lang'), []).append({
+ 'url': value,
+ 'ext': mobj.group('ext')
+ })
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': video_data.get('description'),
+ 'series': video_data.get('moviealbumshowname'),
+ 'episode': title,
+ 'episode_number': int_or_none(video_data.get('episodeno')),
+ 'duration': int_or_none(video_data.get('duration')),
+ 'formats': formats,
+ 'subtitles': subtitles,
+ }
+
+
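The subtitle loop in `ViuIE._real_extract` above recovers language and extension from flattened `subtitle_<lang>_<ext>` keys. A standalone sketch with a hypothetical clip payload (the regex is the one from the extractor; the data is invented):

```python
import re

# Hypothetical clip payload: subtitle URLs are flattened into
# keys like subtitle_<lang>_<ext>.
video_data = {
    'title': 'Citizen Khan - Ep 1',
    'subtitle_en_srt': 'https://example.com/subs/en.srt',
    'subtitle_id_vtt': 'https://example.com/subs/id.vtt',
}

subtitles = {}
for key, value in video_data.items():
    mobj = re.match(r'^subtitle_(?P<lang>[^_]+)_(?P<ext>(vtt|srt))', key)
    if not mobj:
        continue  # non-subtitle fields like 'title' are skipped
    subtitles.setdefault(mobj.group('lang'), []).append({
        'url': value,
        'ext': mobj.group('ext'),
    })
print(sorted(subtitles))
```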
+class ViuPlaylistIE(ViuBaseIE):
+ IE_NAME = 'viu:playlist'
+ _VALID_URL = r'https?://www\.viu\.com/[^/]+/listing/playlist-(?P<id>\d+)'
+ _TEST = {
+ 'url': 'https://www.viu.com/en/listing/playlist-22461380',
+ 'info_dict': {
+ 'id': '22461380',
+ 'title': 'The Good Wife',
+ },
+ 'playlist_count': 16,
+ 'skip': 'Geo-restricted to Indonesia',
+ }
+
+ def _real_extract(self, url):
+ playlist_id = self._match_id(url)
+ playlist_data = self._call_api(
+ 'container/load', playlist_id,
+ 'Downloading playlist info', query={
+ 'appid': 'viu_desktop',
+ 'fmt': 'json',
+ 'id': 'playlist-' + playlist_id
+ })['container']
+
+ entries = []
+ for item in playlist_data.get('item', []):
+ item_id = item.get('id')
+ if not item_id:
+ continue
+ item_id = compat_str(item_id)
+ entries.append(self.url_result(
+ 'viu:' + item_id, 'Viu', item_id))
+
+ return self.playlist_result(
+ entries, playlist_id, playlist_data.get('title'))
+
+
+class ViuOTTIE(InfoExtractor):
+ IE_NAME = 'viu:ott'
+ _VALID_URL = r'https?://(?:www\.)?viu\.com/ott/(?P<country_code>[a-z]{2})/[a-z]{2}-[a-z]{2}/vod/(?P<id>\d+)'
+ _TESTS = [{
+ 'url': 'http://www.viu.com/ott/sg/en-us/vod/3421/The%20Prime%20Minister%20and%20I',
+ 'info_dict': {
+ 'id': '3421',
+ 'ext': 'mp4',
+ 'title': 'A New Beginning',
+ 'description': 'md5:1e7486a619b6399b25ba6a41c0fe5b2c',
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download',
+ },
+ 'skip': 'Geo-restricted to Singapore',
+ }, {
+ 'url': 'http://www.viu.com/ott/hk/zh-hk/vod/7123/%E5%A4%A7%E4%BA%BA%E5%A5%B3%E5%AD%90',
+ 'info_dict': {
+ 'id': '7123',
+ 'ext': 'mp4',
+ 'title': '這就是我的生活之道',
+ 'description': 'md5:4eb0d8b08cf04fcdc6bbbeb16043434f',
+ },
+ 'params': {
+ 'skip_download': 'm3u8 download',
+ },
+ 'skip': 'Geo-restricted to Hong Kong',
+ }]
+
+ def _real_extract(self, url):
+ country_code, video_id = re.match(self._VALID_URL, url).groups()
+
+ product_data = self._download_json(
+ 'http://www.viu.com/ott/%s/index.php' % country_code, video_id,
+ 'Downloading video info', query={
+ 'r': 'vod/ajax-detail',
+ 'platform_flag_label': 'web',
+ 'product_id': video_id,
+ })['data']
+
+ video_data = product_data.get('current_product')
+ if not video_data:
+ raise ExtractorError('This video is not available in your region.', expected=True)
+
+ stream_data = self._download_json(
+ 'https://d1k2us671qcoau.cloudfront.net/distribute_web_%s.php' % country_code,
+ video_id, 'Downloading stream info', query={
+ 'ccs_product_id': video_data['ccs_product_id'],
+ })['data']['stream']
+
+ stream_sizes = stream_data.get('size', {})
+ formats = []
+ for vid_format, stream_url in stream_data.get('url', {}).items():
+ height = int_or_none(self._search_regex(
+ r's(\d+)p', vid_format, 'height', default=None))
+ formats.append({
+ 'format_id': vid_format,
+ 'url': stream_url,
+ 'height': height,
+ 'ext': 'mp4',
+ 'filesize': int_or_none(stream_sizes.get(vid_format))
+ })
+ self._sort_formats(formats)
+
+ subtitles = {}
+ for sub in video_data.get('subtitle', []):
+ sub_url = sub.get('url')
+ if not sub_url:
+ continue
+ subtitles.setdefault(sub.get('name'), []).append({
+ 'url': sub_url,
+ 'ext': 'srt',
+ })
+
+ title = video_data['synopsis'].strip()
+
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': video_data.get('description'),
+ 'series': product_data.get('series', {}).get('name'),
+ 'episode': title,
+ 'episode_number': int_or_none(video_data.get('number')),
+ 'duration': int_or_none(stream_data.get('duration')),
+ 'thumbnail': video_data.get('cover_image_url'),
+ 'formats': formats,
+ 'subtitles': subtitles,
+ }
},
},
{
- # finished live stream, live_mp4
+ # finished live stream, postlive_mp4
'url': 'https://vk.com/videos-387766?z=video-387766_456242764%2Fpl_-387766_-2',
'md5': '90d22d051fccbbe9becfccc615be6791',
'info_dict': {
},
},
{
- # live stream, hls and rtmp links,most likely already finished live
+ # live stream, hls and rtmp links, most likely already finished live
# stream by the time you are reading this comment
'url': 'https://vk.com/video-140332_456239111',
'only_matching': True,
{
'url': 'http://new.vk.com/video205387401_165548505',
'only_matching': True,
+ },
+ {
+ # This video is no longer available, because its author has been blocked.
+ 'url': 'https://vk.com/video-10639516_456240611',
+ 'only_matching': True,
}
]
r'<!>Access denied':
'Access denied to video %s.',
+
+ r'<!>Видеозапись недоступна, так как её автор был заблокирован.':
+ 'Video %s is no longer available, because its author has been blocked.',
+
+ r'<!>This video is no longer available, because its author has been blocked.':
+ 'Video %s is no longer available, because its author has been blocked.',
}
for error_re, error_msg in ERRORS.items():
if not data:
data = self._parse_json(
self._search_regex(
- r'<!json>\s*({.+?})\s*<!>', info_page, 'json'),
- video_id)['player']['params'][0]
+ r'<!json>\s*({.+?})\s*<!>', info_page, 'json', default='{}'),
+ video_id)
+ if data:
+ data = data['player']['params'][0]
+
+ if not data:
+ data = self._parse_json(
+ self._search_regex(
+ r'var\s+playerParams\s*=\s*({.+?})\s*;\s*\n', info_page,
+ 'player params'),
+ video_id)['params'][0]
title = unescapeHTML(data['md_title'])
- if data.get('live') == 2:
+ # 2 = live
+ # 3 = post live (finished live)
+ is_live = data.get('live') == 2
+ if is_live:
title = self._live_title(title)
timestamp = unified_timestamp(self._html_search_regex(
for format_id, format_url in data.items():
if not isinstance(format_url, compat_str) or not format_url.startswith(('http', '//', 'rtmp')):
continue
- if format_id.startswith(('url', 'cache')) or format_id in ('extra_data', 'live_mp4'):
+ if (format_id.startswith(('url', 'cache')) or
+ format_id in ('extra_data', 'live_mp4', 'postlive_mp4')):
height = int_or_none(self._search_regex(
r'^(?:url|cache)(\d+)', format_id, 'height', default=None))
formats.append({
})
elif format_id == 'hls':
formats.extend(self._extract_m3u8_formats(
- format_url, video_id, 'mp4', m3u8_id=format_id,
- fatal=False, live=True))
+ format_url, video_id, 'mp4',
+ entry_protocol='m3u8' if is_live else 'm3u8_native',
+ m3u8_id=format_id, fatal=False, live=is_live))
elif format_id == 'rtmp':
formats.append({
'format_id': format_id,
'duration': data.get('duration'),
'timestamp': timestamp,
'view_count': view_count,
+ 'is_live': is_live,
}
from __future__ import unicode_literals
import re
+import time
+import itertools
from .common import InfoExtractor
+from ..compat import (
+ compat_urllib_parse_urlencode,
+ compat_str,
+)
from ..utils import (
dict_get,
ExtractorError,
float_or_none,
int_or_none,
remove_start,
+ try_get,
+ urlencode_postdata,
)
-from ..compat import compat_urllib_parse_urlencode
class VLiveIE(InfoExtractor):
webpage = self._download_webpage(
'http://www.vlive.tv/video/%s' % video_id, video_id)
- video_params = self._search_regex(
- r'\bvlive\.video\.init\(([^)]+)\)',
- webpage, 'video params')
- status, _, _, live_params, long_video_id, key = re.split(
- r'"\s*,\s*"', video_params)[2:8]
+ VIDEO_PARAMS_RE = r'\bvlive\.video\.init\(([^)]+)'
+ VIDEO_PARAMS_FIELD = 'video params'
+
+ params = self._parse_json(self._search_regex(
+ VIDEO_PARAMS_RE, webpage, VIDEO_PARAMS_FIELD, default=''), video_id,
+ transform_source=lambda s: '[' + s + ']', fatal=False)
+
+ if not params or len(params) < 7:
+ params = self._search_regex(
+ VIDEO_PARAMS_RE, webpage, VIDEO_PARAMS_FIELD)
+ params = [p.strip(r'"') for p in re.split(r'\s*,\s*', params)]
+
+ status, long_video_id, key = params[2], params[5], params[6]
status = remove_start(status, 'PRODUCT_')
if status == 'LIVE_ON_AIR' or status == 'BIG_EVENT_ON_AIR':
- live_params = self._parse_json('"%s"' % live_params, video_id)
- live_params = self._parse_json(live_params, video_id)
- return self._live(video_id, webpage, live_params)
+ return self._live(video_id, webpage)
elif status == 'VOD_ON_AIR' or status == 'BIG_EVENT_INTRO':
if long_video_id and key:
return self._replay(video_id, webpage, long_video_id, key)
'thumbnail': thumbnail,
}
- def _live(self, video_id, webpage, live_params):
+ def _live(self, video_id, webpage):
+ init_page = self._download_webpage(
+ 'http://www.vlive.tv/video/init/view',
+ video_id, note='Downloading live webpage',
+ data=urlencode_postdata({'videoSeq': video_id}),
+ headers={
+ 'Referer': 'http://www.vlive.tv/video/%s' % video_id,
+ 'Content-Type': 'application/x-www-form-urlencoded'
+ })
+
+ live_params = self._search_regex(
+ r'"liveStreamInfo"\s*:\s*(".*"),',
+ init_page, 'live stream info')
+ live_params = self._parse_json(live_params, video_id)
+ live_params = self._parse_json(live_params, video_id)
+
formats = []
for vid in live_params.get('resolutions', []):
formats.extend(self._extract_m3u8_formats(
fatal=False, live=True))
self._sort_formats(formats)
- return dict(self._get_common_fields(webpage),
- id=video_id,
- formats=formats,
- is_live=True)
+ info = self._get_common_fields(webpage)
+ info.update({
+ 'title': self._live_title(info['title']),
+ 'id': video_id,
+ 'formats': formats,
+ 'is_live': True,
+ })
+ return info
def _replay(self, video_id, webpage, long_video_id, key):
playinfo = self._download_json(
'ext': 'vtt',
'url': caption['source']}]
- return dict(self._get_common_fields(webpage),
- id=video_id,
- formats=formats,
- view_count=view_count,
- subtitles=subtitles)
+ info = self._get_common_fields(webpage)
+ info.update({
+ 'id': video_id,
+ 'formats': formats,
+ 'view_count': view_count,
+ 'subtitles': subtitles,
+ })
+ return info
+
+
+class VLiveChannelIE(InfoExtractor):
+ IE_NAME = 'vlive:channel'
+ _VALID_URL = r'https?://channels\.vlive\.tv/(?P<id>[0-9A-Z]+)'
+ _TEST = {
+ 'url': 'http://channels.vlive.tv/FCD4B',
+ 'info_dict': {
+ 'id': 'FCD4B',
+ 'title': 'MAMAMOO',
+ },
+ 'playlist_mincount': 110
+ }
+ _APP_ID = '8c6cc7b45d2568fb668be6e05b6e5a3b'
+
+ def _real_extract(self, url):
+ channel_code = self._match_id(url)
+
+ webpage = self._download_webpage(
+ 'http://channels.vlive.tv/%s/video' % channel_code, channel_code)
+
+ app_id = None
+
+ app_js_url = self._search_regex(
+ r'<script[^>]+src=(["\'])(?P<url>http.+?/app\.js.*?)\1',
+ webpage, 'app js', default=None, group='url')
+
+ if app_js_url:
+ app_js = self._download_webpage(
+ app_js_url, channel_code, 'Downloading app JS', fatal=False)
+ if app_js:
+ app_id = self._search_regex(
+ r'Global\.VFAN_APP_ID\s*=\s*[\'"]([^\'"]+)[\'"]',
+ app_js, 'app id', default=None)
+
+ app_id = app_id or self._APP_ID
+
+ channel_info = self._download_json(
+ 'http://api.vfan.vlive.tv/vproxy/channelplus/decodeChannelCode',
+ channel_code, note='Downloading decode channel code',
+ query={
+ 'app_id': app_id,
+ 'channelCode': channel_code,
+ '_': int(time.time())
+ })
+
+ channel_seq = channel_info['result']['channelSeq']
+ channel_name = None
+ entries = []
+
+ for page_num in itertools.count(1):
+ video_list = self._download_json(
+ 'http://api.vfan.vlive.tv/vproxy/channelplus/getChannelVideoList',
+ channel_code, note='Downloading channel list page #%d' % page_num,
+ query={
+ 'app_id': app_id,
+ 'channelSeq': channel_seq,
+ 'maxNumOfRows': 1000,
+ '_': int(time.time()),
+ 'pageNo': page_num
+ }
+ )
+
+ if not channel_name:
+ channel_name = try_get(
+ video_list,
+ lambda x: x['result']['channelInfo']['channelName'],
+ compat_str)
+
+ videos = try_get(
+ video_list, lambda x: x['result']['videoList'], list)
+ if not videos:
+ break
+
+ for video in videos:
+ video_id = video.get('videoSeq')
+ if not video_id:
+ continue
+ video_id = compat_str(video_id)
+ entries.append(
+ self.url_result(
+ 'http://www.vlive.tv/video/%s' % video_id,
+ ie=VLiveIE.ie_key(), video_id=video_id))
+
+ return self.playlist_result(
+ entries, channel_code, channel_name)
'id': 'e8wvyzz4sl42',
'ext': 'mp4',
'title': 'Germany vs Brazil',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}]
'ext': 'm4a',
'title': 'Watching the Watchers: Building a Sousveillance State',
'description': 'Secret surveillance programs have metadata too. The people and companies that operate secret surveillance programs can be surveilled.',
- 'thumbnail': 're:^https?://.*\.(?:png|jpg)$',
+ 'thumbnail': r're:^https?://.*\.(?:png|jpg)$',
'duration': 1800,
'view_count': int,
}
ExtractorError,
parse_duration,
str_to_int,
+ urljoin,
)
'ext': 'mp4',
'title': 'Violet on her 19th birthday',
'description': 'Violet dances in front of the camera which is sure to get you horny.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'kileyGrope',
'categories': ['Masturbation', 'Teen'],
'duration': 393,
'ext': 'mp4',
'title': 'Hana Shower',
'description': 'Hana showers at the bathroom.',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Hmmmmm',
'categories': ['Big Boobs', 'Erotic', 'Teen', 'Female', '720p'],
'duration': 588,
description = self._html_search_regex(
r'class="(?:descr|description_txt)">(.*?)</div>',
webpage, 'description', fatal=False)
- thumbnail = self._html_search_regex(
- r'flashvars\.imageUrl\s*=\s*"([^"]+)"', webpage, 'description', fatal=False, default=None)
- if thumbnail:
- thumbnail = 'http://www.vporn.com' + thumbnail
+ thumbnail = urljoin('http://www.vporn.com', self._html_search_regex(
+ r'flashvars\.imageUrl\s*=\s*"([^"]+)"', webpage, 'thumbnail',
+ default=None))
uploader = self._html_search_regex(
r'(?s)Uploaded by:.*?<a href="/user/[^"]+"[^>]*>(.+?)</a>',
'ext': 'mp4',
'title': 'Best Drummer Ever [HD]',
'description': 'md5:2d63c4b277b85c2277761c2cf7337d71',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'uploader': 'William',
'timestamp': 1406876915,
'upload_date': '20140801',
'ext': 'mp4',
'title': 'Chiara Grispo - Price Tag by Jessie J',
'description': 'md5:8ea652a1f36818352428cb5134933313',
- 'thumbnail': 're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/102e7e63057-5ebc-4f5c-4065-6ce4ebde131f\.jpg$',
+ 'thumbnail': r're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/102e7e63057-5ebc-4f5c-4065-6ce4ebde131f\.jpg$',
'uploader': 'Chiara.Grispo',
'timestamp': 1388743358,
'upload_date': '20140103',
'ext': 'mp4',
'title': 'My 7 year old Sister and I singing "Alive" by Krewella',
'description': 'md5:40bcacb97796339f1690642c21d56f4a',
- 'thumbnail': 're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/102265d5a9f-0f17-4f6b-5753-adf08484ee1e\.jpg$',
+ 'thumbnail': r're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/102265d5a9f-0f17-4f6b-5753-adf08484ee1e\.jpg$',
'uploader': 'Seraina',
'timestamp': 1396492438,
'upload_date': '20140403',
'ext': 'mp4',
'title': 'Frozen - Let It Go Cover by Siren Gene',
'description': 'My rendition of "Let It Go" originally sung by Idina Menzel.',
- 'thumbnail': 're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/10283ab622a-86c9-4681-51f2-30d1f65774af\.jpg$',
+ 'thumbnail': r're:^http://frame\.thestaticvube\.com/snap/[0-9x]+/10283ab622a-86c9-4681-51f2-30d1f65774af\.jpg$',
'uploader': 'Siren',
'timestamp': 1395448018,
'upload_date': '20140322',
--- /dev/null
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+ ExtractorError,
+ int_or_none,
+ str_or_none,
+)
+
+
+class VVVVIDIE(InfoExtractor):
+ _VALID_URL = r'https?://(?:www\.)?vvvvid\.it/#!(?:show|anime|film|series)/(?P<show_id>\d+)/[^/]+/(?P<season_id>\d+)/(?P<id>[0-9]+)'
+ _TESTS = [{
+ # video_type == 'video/vvvvid'
+ 'url': 'https://www.vvvvid.it/#!show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048/ping-pong',
+ 'md5': 'b8d3cecc2e981adc3835adf07f6df91b',
+ 'info_dict': {
+ 'id': '489048',
+ 'ext': 'mp4',
+ 'title': 'Ping Pong',
+ },
+ }, {
+ # video_type == 'video/rcs'
+ 'url': 'https://www.vvvvid.it/#!show/376/death-note-live-action/377/482493/episodio-01',
+ 'md5': '33e0edfba720ad73a8782157fdebc648',
+ 'info_dict': {
+ 'id': '482493',
+ 'ext': 'mp4',
+ 'title': 'Episodio 01',
+ },
+ }]
+ _conn_id = None
+
+ def _real_initialize(self):
+ self._conn_id = self._download_json(
+ 'https://www.vvvvid.it/user/login',
+ None, headers=self.geo_verification_headers())['data']['conn_id']
+
+ def _real_extract(self, url):
+ show_id, season_id, video_id = re.match(self._VALID_URL, url).groups()
+ response = self._download_json(
+ 'https://www.vvvvid.it/vvvvid/ondemand/%s/season/%s' % (show_id, season_id),
+ video_id, headers=self.geo_verification_headers(), query={
+ 'conn_id': self._conn_id,
+ })
+ if response['result'] == 'error':
+ raise ExtractorError('%s said: %s' % (
+ self.IE_NAME, response['message']), expected=True)
+
+ vid = int(video_id)
+ video_data = list(filter(
+ lambda episode: episode.get('video_id') == vid, response['data']))[0]
+ formats = []
+
+ # vvvvid embed_info decryption algorithm reverse engineered from the function $ds(h) in vvvvid.js
+ def ds(h):
+ g = "MNOPIJKL89+/4567UVWXQRSTEFGHABCDcdefYZabstuvopqr0123wxyzklmnghij"
+
+ def f(m):
+ l = []
+ o = 0
+ b = False
+ m_len = len(m)
+ while not b and o < m_len:
+ n = m[o] << 2
+ o += 1
+ k = -1
+ j = -1
+ if o < m_len:
+ n += m[o] >> 4
+ o += 1
+ if o < m_len:
+ k = (m[o - 1] << 4) & 255
+ k += m[o] >> 2
+ o += 1
+ if o < m_len:
+ j = (m[o - 1] << 6) & 255
+ j += m[o]
+ o += 1
+ else:
+ b = True
+ else:
+ b = True
+ else:
+ b = True
+ l.append(n)
+ if k != -1:
+ l.append(k)
+ if j != -1:
+ l.append(j)
+ return l
+
+ c = []
+ for e in h:
+ c.append(g.index(e))
+
+ c_len = len(c)
+ for e in range(c_len * 2 - 1, -1, -1):
+ a = c[e % c_len] ^ c[(e + 1) % c_len]
+ c[e % c_len] = a
+
+ c = f(c)
+ d = ''
+ for e in c:
+ d += chr(e)
+
+ return d
+
+ for quality in ('_sd', ''):
+ embed_code = video_data.get('embed_info' + quality)
+ if not embed_code:
+ continue
+ embed_code = ds(embed_code)
+ video_type = video_data.get('video_type')
+ if video_type in ('video/rcs', 'video/kenc'):
+ formats.extend(self._extract_akamai_formats(
+ embed_code, video_id))
+ else:
+ formats.extend(self._extract_wowza_formats(
+ 'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id))
+ self._sort_formats(formats)
+
+ return {
+ 'id': video_id,
+ 'title': video_data['title'],
+ 'formats': formats,
+ 'thumbnail': video_data.get('thumbnail'),
+ 'duration': int_or_none(video_data.get('length')),
+ 'series': video_data.get('show_title'),
+ 'season_id': season_id,
+ 'season_number': video_data.get('season_number'),
+ 'episode_id': str_or_none(video_data.get('id')),
+ 'episode_number': int_or_none(video_data.get('number')),
+ 'episode_title': video_data['title'],
+ 'view_count': int_or_none(video_data.get('views')),
+ 'like_count': int_or_none(video_data.get('video_likes')),
+ }
'ext': 'flv',
'title': 'וואן דיירקשן: ההיסטריה',
'description': 'md5:de9e2512a92442574cdb0913c49bc4d8',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
'duration': 3600,
},
'params': {
'display_id': 'hot-milf-from-kerala-shows-off-her-gorgeous-large-breasts-on-camera',
'ext': 'mp4',
'title': 'Hot milf from kerala shows off her gorgeous large breasts on camera',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'LoveJay',
'upload_date': '20160428',
'duration': 226,
'id': 'c8cefd240aa593681c8d068cff59f407_hd',
'ext': 'mp4',
'title': 'Сибирь - Нефтехимик. Лучшие моменты первого периода',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://bl.webcaster.pro/media/start/free_6246c7a4453ac4c42b4398f840d13100_hd/2_2991109016/e8d0d82587ef435480118f9f9c41db41/4635726126',
'id': '4536',
'ext': 'mp4',
'title': 'The temperature of the sun',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Hans Bethe talks about calculating the temperature of the sun',
'duration': 238,
}
'id': '55908',
'ext': 'mp4',
'title': 'The story of Gemmata obscuriglobus',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Planctomycete talks about The story of Gemmata obscuriglobus',
'duration': 169,
},
'id': '54215',
'ext': 'mp4',
'title': '"A Leg to Stand On"',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Oliver Sacks talks about the death and resurrection of a limb',
'duration': 97,
},
entries = [
self.url_result('http://www.webofstories.com/play/%s' % video_number, 'WebOfStories')
- for video_number in set(re.findall('href="/playAll/%s\?sId=(\d+)"' % playlist_id, webpage))
+ for video_number in set(re.findall(r'href="/playAll/%s\?sId=(\d+)"' % playlist_id, webpage))
]
title = self._search_regex(
page = self._download_webpage(url, media_id)
info_json_str = self._search_regex(
- 'var\s+video\s*=\s*(.+});', page, 'info json str')
+ r'var\s+video\s*=\s*(.+});', page, 'info json str')
info_json = self._parse_json(info_json_str, media_id)
letvcloud_url = self._search_regex(
- 'var\s+letvurl\s*=\s*"([^"]+)', page, 'letvcloud url')
+ r'var\s+letvurl\s*=\s*"([^"]+)', page, 'letvcloud url')
return {
'_type': 'url_transparent',
'ext': 'mp4',
'title': 'md5:7358a9faef8b7b57acda7c04816f170e',
'age_limit': 18,
- 'thumbnail': 're:^http://.*\.jpg',
+ 'thumbnail': r're:^http://.*\.jpg',
}
}
'id': '06y9juieqpmi',
'ext': 'mp4',
'title': 'Rebecca Black My Moment Official Music Video Reaction-6GK87Rc8bzQ',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
}, {
'url': 'http://gorillavid.in/embed-z08zf8le23c6-960x480.html',
'id': '3rso4kdn6f9m',
'ext': 'mp4',
'title': 'Micro Pig piglets ready on 16th July 2009-bG0PdrCdxUc',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
}
}, {
'url': 'http://movpod.in/0wguyyxi1yca',
'id': '3ivfabn7573c',
'ext': 'mp4',
'title': 'youtube-dl test video \'äBaW_jenozKc.mp4.mp4',
- 'thumbnail': 're:http://.*\.jpg',
+ 'thumbnail': r're:http://.*\.jpg',
},
'skip': 'Video removed',
}, {
from .common import InfoExtractor
from ..utils import (
dict_get,
- float_or_none,
int_or_none,
+ parse_duration,
unified_strdate,
)
'title': 'FemaleAgent Shy beauty takes the bait',
'upload_date': '20121014',
'uploader': 'Ruseful2011',
- 'duration': 893.52,
+ 'duration': 893,
'age_limit': 18,
},
}, {
'title': 'Britney Spears Sexy Booty',
'upload_date': '20130914',
'uploader': 'jojo747400',
- 'duration': 200.48,
+ 'duration': 200,
'age_limit': 18,
},
'params': {
'title': '....',
'upload_date': '20160208',
'uploader': 'parejafree',
- 'duration': 72.0,
+ 'duration': 72,
'age_limit': 18,
},
'params': {
r'''<video[^>]+poster=(?P<q>["'])(?P<thumbnail>.+?)(?P=q)[^>]*>'''],
webpage, 'thumbnail', fatal=False, group='thumbnail')
- duration = float_or_none(self._search_regex(
- r'(["\'])duration\1\s*:\s*(["\'])(?P<duration>.+?)\2',
- webpage, 'duration', fatal=False, group='duration'))
+ duration = parse_duration(self._search_regex(
+ r'Runtime:\s*</span>\s*([\d:]+)', webpage,
+ 'duration', fatal=False))
view_count = int_or_none(self._search_regex(
r'content=["\']User(?:View|Play)s:(\d+)',
return webpage
def _extract_track(self, track, track_id=None):
- title = track['title']
+ track_name = track.get('songName') or track.get('name') or track['subName']
+ artist = track.get('artist') or track.get('artist_name') or track.get('singers')
+ title = '%s - %s' % (artist, track_name) if artist else track_name
track_url = self._decrypt(track['location'])
subtitles = {}
'thumbnail': track.get('pic') or track.get('album_pic'),
'duration': int_or_none(track.get('length')),
'creator': track.get('artist', '').split(';')[0],
- 'track': title,
- 'album': track.get('album_name'),
- 'artist': track.get('artist'),
+ 'track': track_name,
+ 'track_number': int_or_none(track.get('track')),
+ 'album': track.get('album_name') or track.get('title'),
+ 'artist': artist,
'subtitles': subtitles,
}
class XiamiSongIE(XiamiBaseIE):
IE_NAME = 'xiami:song'
IE_DESC = '虾米音乐'
- _VALID_URL = r'https?://(?:www\.)?xiami\.com/song/(?P<id>[0-9]+)'
+ _VALID_URL = r'https?://(?:www\.)?xiami\.com/song/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.xiami.com/song/1775610518',
'md5': '521dd6bea40fd5c9c69f913c232cb57e',
'info_dict': {
'id': '1775610518',
'ext': 'mp3',
- 'title': 'Woman',
+ 'title': 'HONNE - Woman',
'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
'duration': 265,
'creator': 'HONNE',
'info_dict': {
'id': '1775256504',
'ext': 'mp3',
- 'title': '悟空',
+ 'title': '戴荃 - 悟空',
'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
'duration': 200,
'creator': '戴荃',
},
},
'skip': 'Georestricted',
+ }, {
+ 'url': 'http://www.xiami.com/song/1775953850',
+ 'info_dict': {
+ 'id': '1775953850',
+ 'ext': 'mp3',
+ 'title': 'До Скону - Чума Пожирает Землю',
+ 'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
+ 'duration': 683,
+ 'creator': 'До Скону',
+ 'track': 'Чума Пожирает Землю',
+ 'track_number': 7,
+ 'album': 'Ад',
+ 'artist': 'До Скону',
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ }, {
+ 'url': 'http://www.xiami.com/song/xLHGwgd07a1',
+ 'only_matching': True,
}]
def _real_extract(self, url):
class XiamiAlbumIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:album'
IE_DESC = '虾米音乐 - 专辑'
- _VALID_URL = r'https?://(?:www\.)?xiami\.com/album/(?P<id>[0-9]+)'
+ _VALID_URL = r'https?://(?:www\.)?xiami\.com/album/(?P<id>[^/?#&]+)'
_TYPE = '1'
_TESTS = [{
'url': 'http://www.xiami.com/album/2100300444',
}, {
'url': 'http://www.xiami.com/album/512288?spm=a1z1s.6843761.1110925389.6.hhE9p9',
'only_matching': True,
+ }, {
+ 'url': 'http://www.xiami.com/album/URVDji2a506',
+ 'only_matching': True,
}]
class XiamiArtistIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:artist'
IE_DESC = '虾米音乐 - 歌手'
- _VALID_URL = r'https?://(?:www\.)?xiami\.com/artist/(?P<id>[0-9]+)'
+ _VALID_URL = r'https?://(?:www\.)?xiami\.com/artist/(?P<id>[^/?#&]+)'
_TYPE = '2'
- _TEST = {
+ _TESTS = [{
'url': 'http://www.xiami.com/artist/2132?spm=0.0.0.0.dKaScp',
'info_dict': {
'id': '2132',
},
'playlist_count': 20,
'skip': 'Georestricted',
- }
+ }, {
+ 'url': 'http://www.xiami.com/artist/bC5Tk2K6eb99',
+ 'only_matching': True,
+ }]
class XiamiCollectionIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:collection'
IE_DESC = '虾米音乐 - 精选集'
- _VALID_URL = r'https?://(?:www\.)?xiami\.com/collect/(?P<id>[0-9]+)'
+ _VALID_URL = r'https?://(?:www\.)?xiami\.com/collect/(?P<id>[^/?#&]+)'
_TYPE = '3'
_TEST = {
'url': 'http://www.xiami.com/collect/156527391?spm=a1z1s.2943601.6856193.12.4jpBnr',
'id': '3860914',
'ext': 'mp3',
'title': '孤單南半球-歐德陽',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 247.246,
'timestamp': 1314932940,
'upload_date': '20110902',
'id': '25925099',
'ext': 'mp4',
'title': 'BigBuckBunny_320x180',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 596.458,
'timestamp': 1454242500,
'upload_date': '20160131',
'ext': 'mp4',
'title': '暗殺教室 02',
'description': '字幕:【極影字幕社】',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1384.907,
'timestamp': 1421481240,
'upload_date': '20150117',
'ext': 'mp4',
'timestamp': 1416391590,
'upload_date': '20141119',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
}
}
'title': '少女时代_PARTY_Music Video Teaser',
'creator': '少女时代',
'duration': 25,
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://v.yinyuetai.com/video/h5/2322376',
'id': 'L-11659-99244',
'ext': 'flv',
'title': 'איש לא יודע מאיפה באנו',
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
}
}, {
'url': 'http://hot.ynet.co.il/home/0,7340,L-8859-84418,00.html',
'id': 'L-8859-84418',
'ext': 'flv',
'title': "צפו: הנשיקה הלוהטת של תורגי' ויוליה פלוטקין",
- 'thumbnail': 're:^https?://.*\.jpg',
+ 'thumbnail': r're:^https?://.*\.jpg',
}
}
]
'ext': 'mp4',
'title': 'Sex Ed: Is It Safe To Masturbate Daily?',
'description': 'Love & Sex Answers: http://bit.ly/DanAndJenn -- Is It Unhealthy To Masturbate Daily?',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Ask Dan And Jennifer',
'upload_date': '20101221',
'average_rating': int,
'ext': 'mp4',
'title': 'Big Tits Awesome Brunette On amazing webcam show',
'description': 'http://sweetlivegirls.com Big Tits Awesome Brunette On amazing webcam show.mp4',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Unknown',
'upload_date': '20111125',
'average_rating': int,
from __future__ import unicode_literals
from .common import InfoExtractor
+from ..utils import urljoin
class YourUploadIE(InfoExtractor):
- _VALID_URL = r'''(?x)https?://(?:www\.)?
- (?:yourupload\.com/watch|
- embed\.yourupload\.com|
- embed\.yucache\.net
- )/(?P<id>[A-Za-z0-9]+)
- '''
- _TESTS = [
- {
- 'url': 'http://yourupload.com/watch/14i14h',
- 'md5': '5e2c63385454c557f97c4c4131a393cd',
- 'info_dict': {
- 'id': '14i14h',
- 'ext': 'mp4',
- 'title': 'BigBuckBunny_320x180.mp4',
- 'thumbnail': 're:^https?://.*\.jpe?g',
- }
- },
- {
- 'url': 'http://embed.yourupload.com/14i14h',
- 'only_matching': True,
- },
- {
- 'url': 'http://embed.yucache.net/14i14h?client_file_id=803349',
- 'only_matching': True,
- },
- ]
+ _VALID_URL = r'https?://(?:www\.)?(?:yourupload\.com/(?:watch|embed)|embed\.yourupload\.com)/(?P<id>[A-Za-z0-9]+)'
+ _TESTS = [{
+ 'url': 'http://yourupload.com/watch/14i14h',
+ 'md5': '5e2c63385454c557f97c4c4131a393cd',
+ 'info_dict': {
+ 'id': '14i14h',
+ 'ext': 'mp4',
+ 'title': 'BigBuckBunny_320x180.mp4',
+ 'thumbnail': r're:^https?://.*\.jpe?g',
+ }
+ }, {
+ 'url': 'http://www.yourupload.com/embed/14i14h',
+ 'only_matching': True,
+ }, {
+ 'url': 'http://embed.yourupload.com/14i14h',
+ 'only_matching': True,
+ }]
def _real_extract(self, url):
video_id = self._match_id(url)
- embed_url = 'http://embed.yucache.net/{0:}'.format(video_id)
+ embed_url = 'http://www.yourupload.com/embed/%s' % video_id
+
webpage = self._download_webpage(embed_url, video_id)
title = self._og_search_title(webpage)
- video_url = self._og_search_video_url(webpage)
+ video_url = urljoin(embed_url, self._og_search_video_url(webpage))
thumbnail = self._og_search_thumbnail(webpage, default=None)
return {
sanitized_Request,
smuggle_url,
str_to_int,
+ try_get,
unescapeHTML,
unified_strdate,
unsmuggle_url,
'137': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'138': {'ext': 'mp4', 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40}, # Height can vary (https://github.com/rg3/youtube-dl/issues/4559)
'160': {'ext': 'mp4', 'height': 144, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
+ '212': {'ext': 'mp4', 'height': 480, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'264': {'ext': 'mp4', 'height': 1440, 'format_note': 'DASH video', 'vcodec': 'h264', 'preference': -40},
'298': {'ext': 'mp4', 'height': 720, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60, 'preference': -40},
'299': {'ext': 'mp4', 'height': 1080, 'format_note': 'DASH video', 'vcodec': 'h264', 'fps': 60, 'preference': -40},
'141': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 256, 'preference': -50, 'container': 'm4a_dash'},
'256': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'preference': -50, 'container': 'm4a_dash'},
'258': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'preference': -50, 'container': 'm4a_dash'},
+ '325': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'dtse', 'preference': -50, 'container': 'm4a_dash'},
+ '328': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'ec-3', 'preference': -50, 'container': 'm4a_dash'},
# Dash webm
'167': {'ext': 'webm', 'height': 360, 'width': 640, 'format_note': 'DASH video', 'container': 'webm', 'vcodec': 'vp8', 'preference': -40},
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
'uploader': 'Philipp Hagemeister',
'uploader_id': 'phihag',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/phihag',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/phihag',
'upload_date': '20121002',
'license': 'Standard YouTube License',
'description': 'test chars: "\'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de .',
'categories': ['Science & Technology'],
'tags': ['youtube-dl'],
+ 'duration': 10,
'like_count': int,
'dislike_count': int,
'start_time': 1,
'tags': ['Icona Pop i love it', 'sweden', 'pop music', 'big beat records', 'big beat', 'charli',
'xcx', 'charli xcx', 'girls', 'hbo', 'i love it', "i don't care", 'icona', 'pop',
'iconic ep', 'iconic', 'love', 'it'],
+ 'duration': 180,
'uploader': 'Icona Pop',
'uploader_id': 'IconaPop',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/IconaPop',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/IconaPop',
'license': 'Standard YouTube License',
'creator': 'Icona Pop',
}
'title': 'Justin Timberlake - Tunnel Vision (Explicit)',
'alt_title': 'Tunnel Vision',
'description': 'md5:64249768eec3bc4276236606ea996373',
+ 'duration': 419,
'uploader': 'justintimberlakeVEVO',
'uploader_id': 'justintimberlakeVEVO',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/justintimberlakeVEVO',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/justintimberlakeVEVO',
'license': 'Standard YouTube License',
'creator': 'Justin Timberlake',
'age_limit': 18,
'description': 'md5:09b78bd971f1e3e289601dfba15ca4f7',
'uploader': 'SET India',
'uploader_id': 'setindia',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/setindia',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/setindia',
'license': 'Standard YouTube License',
'age_limit': 18,
}
'title': 'youtube-dl test video "\'/\\ä↭𝕐',
'uploader': 'Philipp Hagemeister',
'uploader_id': 'phihag',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/phihag',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/phihag',
'upload_date': '20121002',
'license': 'Standard YouTube License',
'description': 'test chars: "\'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de .',
'categories': ['Science & Technology'],
'tags': ['youtube-dl'],
+ 'duration': 10,
'like_count': int,
'dislike_count': int,
},
'ext': 'm4a',
'upload_date': '20121002',
'uploader_id': '8KVIDEO',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/8KVIDEO',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/8KVIDEO',
'description': '',
'uploader': '8KVIDEO',
'license': 'Standard YouTube License',
'ext': 'm4a',
'title': 'Afrojack, Spree Wilson - The Spark ft. Spree Wilson',
'description': 'md5:12e7067fa6735a77bdcbb58cb1187d2d',
+ 'duration': 244,
'uploader': 'AfrojackVEVO',
'uploader_id': 'AfrojackVEVO',
'upload_date': '20131011',
'title': 'Taylor Swift - Shake It Off',
'alt_title': 'Shake It Off',
'description': 'md5:95f66187cd7c8b2c13eb78e1223b63c3',
+ 'duration': 242,
'uploader': 'TaylorSwiftVEVO',
'uploader_id': 'TaylorSwiftVEVO',
'upload_date': '20140818',
'info_dict': {
'id': 'T4XJQO3qol8',
'ext': 'mp4',
+ 'duration': 219,
'upload_date': '20100909',
'uploader': 'The Amazing Atheist',
'uploader_id': 'TheAmazingAtheist',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/TheAmazingAtheist',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/TheAmazingAtheist',
'license': 'Standard YouTube License',
'title': 'Burning Everyone\'s Koran',
'description': 'SUBSCRIBE: http://www.youtube.com/saturninefilms\n\nEven Obama has taken a stand against freedom on this issue: http://www.huffingtonpost.com/2010/09/09/obama-gma-interview-quran_n_710282.html',
'id': 'HtVdAasjOgU',
'ext': 'mp4',
'title': 'The Witcher 3: Wild Hunt - The Sword Of Destiny Trailer',
- 'description': 're:(?s).{100,}About the Game\n.*?The Witcher 3: Wild Hunt.{100,}',
+ 'description': r're:(?s).{100,}About the Game\n.*?The Witcher 3: Wild Hunt.{100,}',
+ 'duration': 142,
'uploader': 'The Witcher',
'uploader_id': 'WitcherGame',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/WitcherGame',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/WitcherGame',
'upload_date': '20140605',
'license': 'Standard YouTube License',
'age_limit': 18,
'ext': 'mp4',
'title': 'Dedication To My Ex (Miss That) (Lyric Video)',
'description': 'md5:33765bb339e1b47e7e72b5490139bb41',
+ 'duration': 247,
'uploader': 'LloydVEVO',
'uploader_id': 'LloydVEVO',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/LloydVEVO',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/LloydVEVO',
'upload_date': '20110629',
'license': 'Standard YouTube License',
'age_limit': 18,
'info_dict': {
'id': '__2ABJjxzNo',
'ext': 'mp4',
+ 'duration': 266,
'upload_date': '20100430',
'uploader_id': 'deadmau5',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/deadmau5',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/deadmau5',
'creator': 'deadmau5',
'description': 'md5:12c56784b8032162bb936a5f76d55360',
'uploader': 'deadmau5',
'info_dict': {
'id': 'lqQg6PlCWgI',
'ext': 'mp4',
+ 'duration': 6085,
'upload_date': '20150827',
'uploader_id': 'olympic',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/olympic',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/olympic',
'license': 'Standard YouTube License',
'description': 'HO09 - Women - GER-AUS - Hockey - 31 July 2012 - London 2012 Olympic Games',
'uploader': 'Olympic',
'id': '_b-2C3KPAM0',
'ext': 'mp4',
'stretched_ratio': 16 / 9.,
+ 'duration': 85,
'upload_date': '20110310',
'uploader_id': 'AllenMeow',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/AllenMeow',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/AllenMeow',
'description': 'made by Wacom from Korea | 字幕&加油添醋 by TY\'s Allen | 感謝heylisa00cavey1001同學熱情提供梗及翻譯',
'uploader': '孫艾倫',
'license': 'Standard YouTube License',
'ext': 'mp4',
'title': 'md5:7b81415841e02ecd4313668cde88737a',
'description': 'md5:116377fd2963b81ec4ce64b542173306',
+ 'duration': 220,
'upload_date': '20150625',
'uploader_id': 'dorappi2000',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/dorappi2000',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/dorappi2000',
'uploader': 'dorappi2000',
'license': 'Standard YouTube License',
'formats': 'mincount:32',
'ext': 'mp4',
'title': 'teamPGP: Rocket League Noob Stream (Main Camera)',
'description': 'md5:dc7872fb300e143831327f1bae3af010',
+ 'duration': 7335,
'upload_date': '20150721',
'uploader': 'Beer Games Beer',
'uploader_id': 'beergamesbeer',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
'license': 'Standard YouTube License',
},
}, {
'ext': 'mp4',
'title': 'teamPGP: Rocket League Noob Stream (kreestuh)',
'description': 'md5:dc7872fb300e143831327f1bae3af010',
+ 'duration': 7337,
'upload_date': '20150721',
'uploader': 'Beer Games Beer',
'uploader_id': 'beergamesbeer',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
'license': 'Standard YouTube License',
},
}, {
'ext': 'mp4',
'title': 'teamPGP: Rocket League Noob Stream (grizzle)',
'description': 'md5:dc7872fb300e143831327f1bae3af010',
+ 'duration': 7337,
'upload_date': '20150721',
'uploader': 'Beer Games Beer',
'uploader_id': 'beergamesbeer',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
'license': 'Standard YouTube License',
},
}, {
'ext': 'mp4',
'title': 'teamPGP: Rocket League Noob Stream (zim)',
'description': 'md5:dc7872fb300e143831327f1bae3af010',
+ 'duration': 7334,
'upload_date': '20150721',
'uploader': 'Beer Games Beer',
'uploader_id': 'beergamesbeer',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/beergamesbeer',
'license': 'Standard YouTube License',
},
}],
'title': '{dark walk}; Loki/AC/Dishonored; collab w/Elflover21',
'alt_title': 'Dark Walk',
'description': 'md5:8085699c11dc3f597ce0410b0dcbb34a',
+ 'duration': 133,
'upload_date': '20151119',
'uploader_id': 'IronSoulElf',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/IronSoulElf',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/IronSoulElf',
'uploader': 'IronSoulElf',
'license': 'Standard YouTube License',
'creator': 'Todd Haberman, Daniel Law Heath & Aaron Kaplan',
'ext': 'mp4',
'title': 'md5:e41008789470fc2533a3252216f1c1d1',
'description': 'md5:a677553cf0840649b731a3024aeff4cc',
+ 'duration': 721,
'upload_date': '20150127',
'uploader_id': 'BerkmanCenter',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/BerkmanCenter',
- 'uploader': 'BerkmanCenter',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/BerkmanCenter',
+ 'uploader': 'The Berkman Klein Center for Internet & Society',
'license': 'Creative Commons Attribution license (reuse allowed)',
},
'params': {
'ext': 'mp4',
'title': 'Democratic Socialism and Foreign Policy | Bernie Sanders',
'description': 'md5:dda0d780d5a6e120758d1711d062a867',
+ 'duration': 4060,
'upload_date': '20151119',
'uploader': 'Bernie 2016',
'uploader_id': 'UCH1dpzjCEiGAt8CXkryhkZg',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/channel/UCH1dpzjCEiGAt8CXkryhkZg',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/channel/UCH1dpzjCEiGAt8CXkryhkZg',
'license': 'Creative Commons Attribution license (reuse allowed)',
},
'params': {
'upload_date': '20150811',
'uploader': 'FlixMatrix',
'uploader_id': 'FlixMatrixKaravan',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/FlixMatrixKaravan',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/FlixMatrixKaravan',
'license': 'Standard YouTube License',
},
'params': {
'skip_download': True,
},
+ },
+ {
+ # YouTube Red video with episode data
+ 'url': 'https://www.youtube.com/watch?v=iqKdEhx-dD4',
+ 'info_dict': {
+ 'id': 'iqKdEhx-dD4',
+ 'ext': 'mp4',
+ 'title': 'Isolation - Mind Field (Ep 1)',
+ 'description': 'md5:8013b7ddea787342608f63a13ddc9492',
+ 'duration': 2085,
+ 'upload_date': '20170118',
+ 'uploader': 'Vsauce',
+ 'uploader_id': 'Vsauce',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/Vsauce',
+ 'license': 'Standard YouTube License',
+ 'series': 'Mind Field',
+ 'season_number': 1,
+ 'episode_number': 1,
+ },
+ 'params': {
+ 'skip_download': True,
+ },
+ 'expected_warnings': [
+ 'Skipping DASH manifest',
+ ],
+ },
+ {
+ # itag 212
+ 'url': '1t24XAntNCY',
+ 'only_matching': True,
}
]
def _parse_sig_js(self, jscode):
funcname = self._search_regex(
- r'\.sig\|\|([a-zA-Z0-9$]+)\(', jscode,
- 'Initial JS player signature function name')
+ (r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
+ r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\('),
+ jscode, 'Initial JS player signature function name', group='sig')
jsi = JSInterpreter(jscode)
initial_function = jsi.extract_function(funcname)
if player_url.startswith('//'):
player_url = 'https:' + player_url
+ elif not re.match(r'https?://', player_url):
+ player_url = compat_urlparse.urljoin(
+ 'https://www.youtube.com', player_url)
try:
player_id = (player_url, self._signature_cache_id(s))
if player_id not in self._player_cache:
else:
video_alt_title = video_creator = None
+ m_episode = re.search(
+ r'<div[^>]+id="watch7-headline"[^>]*>\s*<span[^>]*>.*?>(?P<series>[^<]+)</a></b>\s*S(?P<season>\d+)\s*•\s*E(?P<episode>\d+)</span>',
+ video_webpage)
+ if m_episode:
+ series = m_episode.group('series')
+ season_number = int(m_episode.group('season'))
+ episode_number = int(m_episode.group('episode'))
+ else:
+ series = season_number = episode_number = None
+
m_cat_container = self._search_regex(
r'(?s)<h4[^>]*>\s*Category\s*</h4>\s*<ul[^>]*>(.*?)</ul>',
video_webpage, 'categories', default=None)
video_subtitles = self.extract_subtitles(video_id, video_webpage)
automatic_captions = self.extract_automatic_captions(video_id, video_webpage)
- if 'length_seconds' not in video_info:
- self._downloader.report_warning('unable to extract video duration')
- video_duration = None
- else:
- video_duration = int(compat_urllib_parse_unquote_plus(video_info['length_seconds'][0]))
+ video_duration = try_get(
+ video_info, lambda x: int_or_none(x['length_seconds'][0]))
+ if not video_duration:
+ video_duration = parse_duration(self._html_search_meta(
+ 'duration', video_webpage, 'video duration'))
# annotations
video_annotations = None
'is_live': is_live,
'start_time': start_time,
'end_time': end_time,
+ 'series': series,
+ 'season_number': season_number,
+ 'episode_number': episode_number,
}
youtu\.be/[0-9A-Za-z_-]{11}\?.*?\blist=
)
(
- (?:PL|LL|EC|UU|FL|RD|UL)?[0-9A-Za-z-_]{10,}
+ (?:PL|LL|EC|UU|FL|RD|UL|TL)?[0-9A-Za-z-_]{10,}
                            # Top tracks; they can also include dots
|(?:MC)[\w\.]*
)
.*
|
- ((?:PL|LL|EC|UU|FL|RD|UL)[0-9A-Za-z-_]{10,})
+ ((?:PL|LL|EC|UU|FL|RD|UL|TL)[0-9A-Za-z-_]{10,})
)"""
_TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s&disable_polymer=true'
_VIDEO_RE = r'href="\s*/watch\?v=(?P<id>[0-9A-Za-z_-]{11})&[^"]*?index=(?P<index>\d+)(?:[^>]+>(?P<title>[^<]+))?'
'title': 'YDL_Empty_List',
},
'playlist_count': 0,
+ 'skip': 'This playlist is private',
}, {
        'note': 'Playlist with deleted videos (#651). As a bonus, video #51 also appears twice in this list.',
'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
'id': 'PLtPgu7CB4gbY9oDN3drwC3cMbJggS7dKl',
},
'playlist_count': 2,
+ 'skip': 'This playlist is private',
}, {
'note': 'embedded',
'url': 'https://www.youtube.com/embed/videoseries?list=PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
'title': "Smiley's People 01 detective, Adventure Series, Action",
'uploader': 'STREEM',
'uploader_id': 'UCyPhqAZgwYWZfxElWVbVJng',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/channel/UCyPhqAZgwYWZfxElWVbVJng',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/channel/UCyPhqAZgwYWZfxElWVbVJng',
'upload_date': '20150526',
'license': 'Standard YouTube License',
'description': 'md5:507cdcb5a49ac0da37a920ece610be80',
'title': 'Small Scale Baler and Braiding Rugs',
'uploader': 'Backus-Page House Museum',
'uploader_id': 'backuspagemuseum',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/backuspagemuseum',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/backuspagemuseum',
'upload_date': '20161008',
'license': 'Standard YouTube License',
'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
}, {
'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
'only_matching': True,
+ }, {
+ 'url': 'TLGGrESM50VT6acwMjAyMjAxNw',
+ 'only_matching': True,
}]
def _real_initialize(self):
url = self._TEMPLATE_URL % playlist_id
page = self._download_webpage(url, playlist_id)
- for match in re.findall(r'<div class="yt-alert-message">([^<]+)</div>', page):
+        # The yt-alert-message div now has a tabindex attribute (see https://github.com/rg3/youtube-dl/issues/11604)
+ for match in re.findall(r'<div class="yt-alert-message"[^>]*>([^<]+)</div>', page):
match = match.strip()
# Check if the playlist exists or is private
- if re.match(r'[^<]*(The|This) playlist (does not exist|is private)[^<]*', match):
- raise ExtractorError(
- 'The playlist doesn\'t exist or is private, use --username or '
- '--netrc to access it.',
- expected=True)
+ mobj = re.match(r'[^<]*(?:The|This) playlist (?P<reason>does not exist|is private)[^<]*', match)
+ if mobj:
+ reason = mobj.group('reason')
+ message = 'This playlist %s' % reason
+ if 'private' in reason:
+ message += ', use --username or --netrc to access it'
+ message += '.'
+ raise ExtractorError(message, expected=True)
elif re.match(r'[^<]*Invalid parameters[^<]*', match):
raise ExtractorError(
'Invalid parameters. Maybe URL is incorrect.',
'title': 'The Young Turks - Live Main Show',
'uploader': 'The Young Turks',
'uploader_id': 'TheYoungTurks',
- 'uploader_url': 're:https?://(?:www\.)?youtube\.com/user/TheYoungTurks',
+ 'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/TheYoungTurks',
'upload_date': '20150715',
'license': 'Standard YouTube License',
'description': 'md5:438179573adcdff3c97ebb1ee632b891',
videos = []
limit = n
+ url_query = {
+ 'search_query': query.encode('utf-8'),
+ }
+ url_query.update(self._EXTRA_QUERY_ARGS)
+ result_url = 'https://www.youtube.com/results?' + compat_urllib_parse_urlencode(url_query)
+
for pagenum in itertools.count(1):
- url_query = {
- 'search_query': query.encode('utf-8'),
- 'page': pagenum,
- 'spf': 'navigate',
- }
- url_query.update(self._EXTRA_QUERY_ARGS)
- result_url = 'https://www.youtube.com/results?' + compat_urllib_parse_urlencode(url_query)
data = self._download_json(
result_url, video_id='query "%s"' % query,
note='Downloading page %s' % pagenum,
- errnote='Unable to download API page')
+ errnote='Unable to download API page',
+ query={'spf': 'navigate'})
html_content = data[1]['body']['content']
if 'class="search-message' in html_content:
videos += new_videos
if not new_videos or len(videos) > limit:
break
+ next_link = self._html_search_regex(
+ r'href="(/results\?[^"]*\bsp=[^"]+)"[^>]*>\s*<span[^>]+class="[^"]*\byt-uix-button-content\b[^"]*"[^>]*>Next',
+ html_content, 'next link', default=None)
+ if next_link is None:
+ break
+ result_url = compat_urlparse.urljoin('https://www.youtube.com/', next_link)
if len(videos) > n:
videos = videos[:n]
'ext': 'mp4',
'title': 'EP2S3 - Bon Appétit - Eh bé viva les pyrénées con!',
'description': 'md5:7054d6f6f620c6519be1fe710d4da847',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 528,
'timestamp': 1359044972,
'upload_date': '20130124',
# coding: utf-8
from __future__ import unicode_literals
-import functools
import re
from .common import InfoExtractor
+from ..compat import compat_str
from ..utils import (
- int_or_none,
- unified_strdate,
- OnDemandPagedList,
- xpath_text,
determine_ext,
+ int_or_none,
+ NO_DEFAULT,
+ orderedSet,
+ parse_codecs,
qualities,
- float_or_none,
- ExtractorError,
+ try_get,
+ unified_timestamp,
+ update_url_query,
+ urljoin,
)
-class ZDFIE(InfoExtractor):
- _VALID_URL = r'(?:zdf:|zdf:video:|https?://www\.zdf\.de/ZDFmediathek(?:#)?/(.*beitrag/(?:video/)?))(?P<id>[0-9]+)(?:/[^/?]+)?(?:\?.*)?'
+class ZDFBaseIE(InfoExtractor):
+ def _call_api(self, url, player, referrer, video_id):
+ return self._download_json(
+ url, video_id, 'Downloading JSON content',
+ headers={
+ 'Referer': referrer,
+ 'Api-Auth': 'Bearer %s' % player['apiToken'],
+ })
+
+ def _extract_player(self, webpage, video_id, fatal=True):
+ return self._parse_json(
+ self._search_regex(
+ r'(?s)data-zdfplayer-jsb=(["\'])(?P<json>{.+?})\1', webpage,
+ 'player JSON', default='{}' if not fatal else NO_DEFAULT,
+ group='json'),
+ video_id)
+
+
+class ZDFIE(ZDFBaseIE):
+ _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?]+)\.html'
+ _QUALITIES = ('auto', 'low', 'med', 'high', 'veryhigh')
_TESTS = [{
- 'url': 'http://www.zdf.de/ZDFmediathek/beitrag/video/2037704/ZDFspezial---Ende-des-Machtpokers--?bc=sts;stt',
+ 'url': 'https://www.zdf.de/service-und-hilfe/die-neue-zdf-mediathek/zdfmediathek-trailer-100.html',
'info_dict': {
- 'id': '2037704',
- 'ext': 'webm',
- 'title': 'ZDFspezial - Ende des Machtpokers',
- 'description': 'Union und SPD haben sich auf einen Koalitionsvertrag geeinigt. Aber was bedeutet das für die Bürger? Sehen Sie hierzu das ZDFspezial "Ende des Machtpokers - Große Koalition für Deutschland".',
- 'duration': 1022,
- 'uploader': 'spezial',
- 'uploader_id': '225948',
- 'upload_date': '20131127',
- },
- 'skip': 'Videos on ZDF.de are depublicised in short order',
+ 'id': 'zdfmediathek-trailer-100',
+ 'ext': 'mp4',
+ 'title': 'Die neue ZDFmediathek',
+ 'description': 'md5:3003d36487fb9a5ea2d1ff60beb55e8d',
+ 'duration': 30,
+ 'timestamp': 1477627200,
+ 'upload_date': '20161028',
+ }
+ }, {
+ 'url': 'https://www.zdf.de/filme/taunuskrimi/die-lebenden-und-die-toten-1---ein-taunuskrimi-100.html',
+ 'only_matching': True,
+ }, {
+ 'url': 'https://www.zdf.de/dokumentation/planet-e/planet-e-uebersichtsseite-weitere-dokumentationen-von-planet-e-100.html',
+ 'only_matching': True,
}]
- def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
- param_groups = {}
- for param_group in smil.findall(self._xpath_ns('./head/paramGroup', namespace)):
- group_id = param_group.attrib.get(self._xpath_ns('id', 'http://www.w3.org/XML/1998/namespace'))
- params = {}
- for param in param_group:
- params[param.get('name')] = param.get('value')
- param_groups[group_id] = params
+ @staticmethod
+ def _extract_subtitles(src):
+ subtitles = {}
+ for caption in try_get(src, lambda x: x['captions'], list) or []:
+ subtitle_url = caption.get('uri')
+ if subtitle_url and isinstance(subtitle_url, compat_str):
+ lang = caption.get('language', 'deu')
+ subtitles.setdefault(lang, []).append({
+ 'url': subtitle_url,
+ })
+ return subtitles
+
+ def _extract_format(self, video_id, formats, format_urls, meta):
+ format_url = meta.get('url')
+ if not format_url or not isinstance(format_url, compat_str):
+ return
+ if format_url in format_urls:
+ return
+ format_urls.add(format_url)
+ mime_type = meta.get('mimeType')
+ ext = determine_ext(format_url)
+ if mime_type == 'application/x-mpegURL' or ext == 'm3u8':
+ formats.extend(self._extract_m3u8_formats(
+ format_url, video_id, 'mp4', m3u8_id='hls',
+ entry_protocol='m3u8_native', fatal=False))
+ elif mime_type == 'application/f4m+xml' or ext == 'f4m':
+ formats.extend(self._extract_f4m_formats(
+ update_url_query(format_url, {'hdcore': '3.7.0'}), video_id, f4m_id='hds', fatal=False))
+ else:
+ f = parse_codecs(meta.get('mimeCodec'))
+ format_id = ['http']
+ for p in (meta.get('type'), meta.get('quality')):
+ if p and isinstance(p, compat_str):
+ format_id.append(p)
+ f.update({
+ 'url': format_url,
+ 'format_id': '-'.join(format_id),
+ 'format_note': meta.get('quality'),
+ 'language': meta.get('language'),
+ 'quality': qualities(self._QUALITIES)(meta.get('quality')),
+ 'preference': -10,
+ })
+ formats.append(f)
+
+ def _extract_entry(self, url, content, video_id):
+ title = content.get('title') or content['teaserHeadline']
+
+ t = content['mainVideoContent']['http://zdf.de/rels/target']
+
+ ptmd_path = t.get('http://zdf.de/rels/streams/ptmd')
+
+ if not ptmd_path:
+ ptmd_path = t[
+ 'http://zdf.de/rels/streams/ptmd-template'].replace(
+ '{playerId}', 'portal')
+
+ ptmd = self._download_json(urljoin(url, ptmd_path), video_id)
formats = []
- for video in smil.findall(self._xpath_ns('.//video', namespace)):
- src = video.get('src')
- if not src:
+ track_uris = set()
+ for p in ptmd['priorityList']:
+ formitaeten = p.get('formitaeten')
+ if not isinstance(formitaeten, list):
continue
- bitrate = float_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
- group_id = video.get('paramGroup')
- param_group = param_groups[group_id]
- for proto in param_group['protocols'].split(','):
- formats.append({
- 'url': '%s://%s' % (proto, param_group['host']),
- 'app': param_group['app'],
- 'play_path': src,
- 'ext': 'flv',
- 'format_id': '%s-%d' % (proto, bitrate),
- 'tbr': bitrate,
- })
+ for f in formitaeten:
+ f_qualities = f.get('qualities')
+ if not isinstance(f_qualities, list):
+ continue
+ for quality in f_qualities:
+ tracks = try_get(quality, lambda x: x['audio']['tracks'], list)
+ if not tracks:
+ continue
+ for track in tracks:
+ self._extract_format(
+ video_id, formats, track_uris, {
+ 'url': track.get('uri'),
+ 'type': f.get('type'),
+ 'mimeType': f.get('mimeType'),
+ 'quality': quality.get('quality'),
+ 'language': track.get('language'),
+ })
self._sort_formats(formats)
- return formats
-
- def extract_from_xml_url(self, video_id, xml_url):
- doc = self._download_xml(
- xml_url, video_id,
- note='Downloading video info',
- errnote='Failed to download video info')
-
- status_code = doc.find('./status/statuscode')
- if status_code is not None and status_code.text != 'ok':
- code = status_code.text
- if code == 'notVisibleAnymore':
- message = 'Video %s is not available' % video_id
- else:
- message = '%s returned error: %s' % (self.IE_NAME, code)
- raise ExtractorError(message, expected=True)
-
- title = doc.find('.//information/title').text
- description = xpath_text(doc, './/information/detail', 'description')
- duration = int_or_none(xpath_text(doc, './/details/lengthSec', 'duration'))
- uploader = xpath_text(doc, './/details/originChannelTitle', 'uploader')
- uploader_id = xpath_text(doc, './/details/originChannelId', 'uploader id')
- upload_date = unified_strdate(xpath_text(doc, './/details/airtime', 'upload date'))
- subtitles = {}
- captions_url = doc.find('.//caption/url')
- if captions_url is not None:
- subtitles['de'] = [{
- 'url': captions_url.text,
- 'ext': 'ttml',
- }]
-
- def xml_to_thumbnails(fnode):
- thumbnails = []
- for node in fnode:
- thumbnail_url = node.text
- if not thumbnail_url:
+
+ thumbnails = []
+ layouts = try_get(
+ content, lambda x: x['teaserImageRef']['layouts'], dict)
+ if layouts:
+ for layout_key, layout_url in layouts.items():
+ if not isinstance(layout_url, compat_str):
continue
thumbnail = {
- 'url': thumbnail_url,
+ 'url': layout_url,
+ 'format_id': layout_key,
}
- if 'key' in node.attrib:
- m = re.match('^([0-9]+)x([0-9]+)$', node.attrib['key'])
- if m:
- thumbnail['width'] = int(m.group(1))
- thumbnail['height'] = int(m.group(2))
+ mobj = re.search(r'(?P<width>\d+)x(?P<height>\d+)', layout_key)
+ if mobj:
+ thumbnail.update({
+ 'width': int(mobj.group('width')),
+ 'height': int(mobj.group('height')),
+ })
thumbnails.append(thumbnail)
- return thumbnails
- thumbnails = xml_to_thumbnails(doc.findall('.//teaserimages/teaserimage'))
+ return {
+ 'id': video_id,
+ 'title': title,
+ 'description': content.get('leadParagraph') or content.get('teasertext'),
+ 'duration': int_or_none(t.get('duration')),
+ 'timestamp': unified_timestamp(content.get('editorialDate')),
+ 'thumbnails': thumbnails,
+ 'subtitles': self._extract_subtitles(ptmd),
+ 'formats': formats,
+ }
- format_nodes = doc.findall('.//formitaeten/formitaet')
- quality = qualities(['veryhigh', 'high', 'med', 'low'])
+ def _extract_regular(self, url, player, video_id):
+ content = self._call_api(player['content'], player, url, video_id)
+ return self._extract_entry(player['content'], content, video_id)
- def get_quality(elem):
- return quality(xpath_text(elem, 'quality'))
- format_nodes.sort(key=get_quality)
- format_ids = []
- formats = []
- for fnode in format_nodes:
- video_url = fnode.find('url').text
- is_available = 'http://www.metafilegenerator' not in video_url
- if not is_available:
- continue
- format_id = fnode.attrib['basetype']
- quality = xpath_text(fnode, './quality', 'quality')
- format_m = re.match(r'''(?x)
- (?P<vcodec>[^_]+)_(?P<acodec>[^_]+)_(?P<container>[^_]+)_
- (?P<proto>[^_]+)_(?P<index>[^_]+)_(?P<indexproto>[^_]+)
- ''', format_id)
-
- ext = determine_ext(video_url, None) or format_m.group('container')
- if ext not in ('smil', 'f4m', 'm3u8'):
- format_id = format_id + '-' + quality
- if format_id in format_ids:
- continue
+ def _extract_mobile(self, video_id):
+ document = self._download_json(
+ 'https://zdf-cdn.live.cellular.de/mediathekV2/document/%s' % video_id,
+ video_id)['document']
- if ext == 'meta':
- continue
- elif ext == 'smil':
- formats.extend(self._extract_smil_formats(
- video_url, video_id, fatal=False))
- elif ext == 'm3u8':
- # the certificates are misconfigured (see
- # https://github.com/rg3/youtube-dl/issues/8665)
- if video_url.startswith('https://'):
- continue
- formats.extend(self._extract_m3u8_formats(
- video_url, video_id, 'mp4', m3u8_id=format_id, fatal=False))
- elif ext == 'f4m':
- formats.extend(self._extract_f4m_formats(
- video_url, video_id, f4m_id=format_id, fatal=False))
- else:
- proto = format_m.group('proto').lower()
-
- abr = int_or_none(xpath_text(fnode, './audioBitrate', 'abr'), 1000)
- vbr = int_or_none(xpath_text(fnode, './videoBitrate', 'vbr'), 1000)
-
- width = int_or_none(xpath_text(fnode, './width', 'width'))
- height = int_or_none(xpath_text(fnode, './height', 'height'))
-
- filesize = int_or_none(xpath_text(fnode, './filesize', 'filesize'))
-
- format_note = ''
- if not format_note:
- format_note = None
-
- formats.append({
- 'format_id': format_id,
- 'url': video_url,
- 'ext': ext,
- 'acodec': format_m.group('acodec'),
- 'vcodec': format_m.group('vcodec'),
- 'abr': abr,
- 'vbr': vbr,
- 'width': width,
- 'height': height,
- 'filesize': filesize,
- 'format_note': format_note,
- 'protocol': proto,
- '_available': is_available,
- })
- format_ids.append(format_id)
+ title = document['titel']
+ formats = []
+ format_urls = set()
+ for f in document['formitaeten']:
+ self._extract_format(video_id, formats, format_urls, f)
self._sort_formats(formats)
+ thumbnails = []
+ teaser_bild = document.get('teaserBild')
+ if isinstance(teaser_bild, dict):
+ for thumbnail_key, thumbnail in teaser_bild.items():
+ thumbnail_url = try_get(
+ thumbnail, lambda x: x['url'], compat_str)
+ if thumbnail_url:
+ thumbnails.append({
+ 'url': thumbnail_url,
+ 'id': thumbnail_key,
+ 'width': int_or_none(thumbnail.get('width')),
+ 'height': int_or_none(thumbnail.get('height')),
+ })
+
return {
'id': video_id,
'title': title,
- 'description': description,
- 'duration': duration,
+ 'description': document.get('beschreibung'),
+ 'duration': int_or_none(document.get('length')),
+ 'timestamp': unified_timestamp(try_get(
+ document, lambda x: x['meta']['editorialDate'], compat_str)),
'thumbnails': thumbnails,
- 'uploader': uploader,
- 'uploader_id': uploader_id,
- 'upload_date': upload_date,
+ 'subtitles': self._extract_subtitles(document),
'formats': formats,
- 'subtitles': subtitles,
}
def _real_extract(self, url):
video_id = self._match_id(url)
- xml_url = 'http://www.zdf.de/ZDFmediathek/xmlservice/web/beitragsDetails?ak=web&id=%s' % video_id
- return self.extract_from_xml_url(video_id, xml_url)
+ webpage = self._download_webpage(url, video_id, fatal=False)
+ if webpage:
+ player = self._extract_player(webpage, url, fatal=False)
+ if player:
+ return self._extract_regular(url, player, video_id)
+
+ return self._extract_mobile(video_id)
-class ZDFChannelIE(InfoExtractor):
- _VALID_URL = r'(?:zdf:topic:|https?://www\.zdf\.de/ZDFmediathek(?:#)?/.*kanaluebersicht/(?:[^/]+/)?)(?P<id>[0-9]+)'
+
+class ZDFChannelIE(ZDFBaseIE):
+ _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
- 'url': 'http://www.zdf.de/ZDFmediathek#/kanaluebersicht/1586442/sendung/Titanic',
+ 'url': 'https://www.zdf.de/sport/das-aktuelle-sportstudio',
'info_dict': {
- 'id': '1586442',
+ 'id': 'das-aktuelle-sportstudio',
+ 'title': 'das aktuelle sportstudio | ZDF',
},
- 'playlist_count': 3,
- }, {
- 'url': 'http://www.zdf.de/ZDFmediathek/kanaluebersicht/aktuellste/332',
- 'only_matching': True,
+ 'playlist_count': 21,
}, {
- 'url': 'http://www.zdf.de/ZDFmediathek/kanaluebersicht/meist-gesehen/332',
- 'only_matching': True,
+ 'url': 'https://www.zdf.de/dokumentation/planet-e',
+ 'info_dict': {
+ 'id': 'planet-e',
+ 'title': 'planet e.',
+ },
+ 'playlist_count': 4,
}, {
- 'url': 'http://www.zdf.de/ZDFmediathek/kanaluebersicht/_/1798716?bc=nrt;nrm?flash=off',
+ 'url': 'https://www.zdf.de/filme/taunuskrimi/',
'only_matching': True,
}]
- _PAGE_SIZE = 50
-
- def _fetch_page(self, channel_id, page):
- offset = page * self._PAGE_SIZE
- xml_url = (
- 'http://www.zdf.de/ZDFmediathek/xmlservice/web/aktuellste?ak=web&offset=%d&maxLength=%d&id=%s'
- % (offset, self._PAGE_SIZE, channel_id))
- doc = self._download_xml(
- xml_url, channel_id,
- note='Downloading channel info',
- errnote='Failed to download channel info')
-
- title = doc.find('.//information/title').text
- description = doc.find('.//information/detail').text
- for asset in doc.findall('.//teasers/teaser'):
- a_type = asset.find('./type').text
- a_id = asset.find('./details/assetId').text
- if a_type not in ('video', 'topic'):
- continue
- yield {
- '_type': 'url',
- 'playlist_title': title,
- 'playlist_description': description,
- 'url': 'zdf:%s:%s' % (a_type, a_id),
- }
+
+ @classmethod
+ def suitable(cls, url):
+ return False if ZDFIE.suitable(url) else super(ZDFChannelIE, cls).suitable(url)
def _real_extract(self, url):
channel_id = self._match_id(url)
- entries = OnDemandPagedList(
- functools.partial(self._fetch_page, channel_id), self._PAGE_SIZE)
- return {
- '_type': 'playlist',
- 'id': channel_id,
- 'entries': entries,
- }
+ webpage = self._download_webpage(url, channel_id)
+
+ entries = [
+ self.url_result(item_url, ie=ZDFIE.ie_key())
+ for item_url in orderedSet(re.findall(
+ r'data-plusbar-url=["\'](http.+?\.html)', webpage))]
+
+ return self.playlist_result(
+ entries, channel_id, self._og_search_title(webpage, fatal=False))
+
+ r"""
+ player = self._extract_player(webpage, channel_id)
+
+ channel_id = self._search_regex(
+ r'docId\s*:\s*(["\'])(?P<id>(?!\1).+?)\1', webpage,
+ 'channel id', group='id')
+
+ channel = self._call_api(
+ 'https://api.zdf.de/content/documents/%s.json' % channel_id,
+ player, url, channel_id)
+
+ items = []
+ for module in channel['module']:
+ for teaser in try_get(module, lambda x: x['teaser'], list) or []:
+ t = try_get(
+ teaser, lambda x: x['http://zdf.de/rels/target'], dict)
+ if not t:
+ continue
+ items.extend(try_get(
+ t,
+ lambda x: x['resultsWithVideo']['http://zdf.de/rels/search/results'],
+ list) or [])
+ items.extend(try_get(
+ module,
+ lambda x: x['filterRef']['resultsWithVideo']['http://zdf.de/rels/search/results'],
+ list) or [])
+
+ entries = []
+ entry_urls = set()
+ for item in items:
+ t = try_get(item, lambda x: x['http://zdf.de/rels/target'], dict)
+ if not t:
+ continue
+ sharing_url = t.get('http://zdf.de/rels/sharing-url')
+ if not sharing_url or not isinstance(sharing_url, compat_str):
+ continue
+ if sharing_url in entry_urls:
+ continue
+ entry_urls.add(sharing_url)
+ entries.append(self.url_result(
+ sharing_url, ie=ZDFIE.ie_key(), video_id=t.get('id')))
+
+ return self.playlist_result(entries, channel_id, channel.get('title'))
+ """
'id': 'ZWZB9WAB',
'title': 'Xa Mãi Xa',
'ext': 'mp3',
- 'thumbnail': 're:^https?://.*\.jpg$',
+ 'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'http://mp3.zing.vn/video-clip/Let-It-Go-Frozen-OST-Sungha-Jung/ZW6BAEA0.html',
def extract_object(self, objname):
obj = {}
obj_m = re.search(
- (r'(?:var\s+)?%s\s*=\s*\{' % re.escape(objname)) +
+ (r'(?<!this\.)%s\s*=\s*\{' % re.escape(objname)) +
r'\s*(?P<fields>([a-zA-Z$0-9]+\s*:\s*function\(.*?\)\s*\{.*?\}(?:,\s*)?)*)' +
r'\}\s*;',
self.code)
'When given in the global configuration file /etc/youtube-dl.conf: '
'Do not read the user configuration in ~/.config/youtube-dl/config '
'(%APPDATA%/youtube-dl/config.txt on Windows)')
+ general.add_option(
+ '--config-location',
+ dest='config_location', metavar='PATH',
+ help='Location of the configuration file; either the path to the config or its containing directory.')
general.add_option(
'--flat-playlist',
action='store_const', dest='extract_flat', const='in_playlist',
network.add_option(
'--source-address',
metavar='IP', dest='source_address', default=None,
- help='Client-side IP address to bind to (experimental)',
+ help='Client-side IP address to bind to',
)
network.add_option(
'-4', '--force-ipv4',
action='store_const', const='0.0.0.0', dest='source_address',
- help='Make all connections via IPv4 (experimental)',
+ help='Make all connections via IPv4',
)
network.add_option(
'-6', '--force-ipv6',
action='store_const', const='::', dest='source_address',
- help='Make all connections via IPv6 (experimental)',
+ help='Make all connections via IPv6',
)
network.add_option(
'--geo-verification-proxy',
dest='geo_verification_proxy', default=None, metavar='URL',
help='Use this proxy to verify the IP address for some geo-restricted sites. '
- 'The default proxy specified by --proxy (or none, if the options is not present) is used for the actual downloading. (experimental)'
+             'The default proxy specified by --proxy (or none, if the option is not present) is used for the actual downloading.'
)
network.add_option(
'--cn-verification-proxy',
'--match-filter',
metavar='FILTER', dest='match_filter', default=None,
help=(
- 'Generic video filter (experimental). '
+ 'Generic video filter. '
'Specify any key (see help for -o for a list of available keys) to'
' match if the key is present, '
'!key to check if the key is not present,'
authentication.add_option(
'-2', '--twofactor',
dest='twofactor', metavar='TWOFACTOR',
- help='Two-factor auth code')
+ help='Two-factor authentication code')
authentication.add_option(
'-n', '--netrc',
action='store_true', dest='usenetrc', default=False,
'--skip-unavailable-fragments',
action='store_true', dest='skip_unavailable_fragments', default=True,
help='Skip unavailable fragments (DASH and hlsnative only)')
- general.add_option(
+ downloader.add_option(
'--abort-on-unavailable-fragment',
action='store_false', dest='skip_unavailable_fragments',
help='Abort downloading when some fragment is not available')
'--playlist-reverse',
action='store_true',
help='Download playlist videos in reverse order')
+ downloader.add_option(
+ '--playlist-random',
+ action='store_true',
+ help='Download playlist videos in random order')
downloader.add_option(
'--xattr-set-filesize',
dest='xattr_set_filesize', action='store_true',
- help='Set file xattribute ytdl.filesize with expected filesize (experimental)')
+ help='Set file xattribute ytdl.filesize with expected file size (experimental)')
downloader.add_option(
'--hls-prefer-native',
dest='hls_prefer_native', action='store_true', default=None,
help=('Output filename template, see the "OUTPUT TEMPLATE" for all the info'))
filesystem.add_option(
'--autonumber-size',
- dest='autonumber_size', metavar='NUMBER',
- help='Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given')
+ dest='autonumber_size', metavar='NUMBER', default=5, type=int,
+ help='Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given (default is %default)')
+ filesystem.add_option(
+ '--autonumber-start',
+ dest='autonumber_start', metavar='NUMBER', default=1, type=int,
+ help='Specify the start value for %(autonumber)s (default is %default)')
filesystem.add_option(
'--restrict-filenames',
action='store_true', dest='restrictfilenames', default=False,
help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
postproc.add_option(
'--audio-format', metavar='FORMAT', dest='audioformat', default='best',
- help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default')
+        help='Specify audio format: "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "%default" by default; no effect without -x')
postproc.add_option(
'--audio-quality', metavar='QUALITY',
dest='audioquality', default='5',
return conf
command_line_conf = compat_conf(sys.argv[1:])
-
- if '--ignore-config' in command_line_conf:
- system_conf = []
- user_conf = []
+ opts, args = parser.parse_args(command_line_conf)
+
+ system_conf = user_conf = custom_conf = []
+
+ if '--config-location' in command_line_conf:
+ location = compat_expanduser(opts.config_location)
+ if os.path.isdir(location):
+ location = os.path.join(location, 'youtube-dl.conf')
+ if not os.path.exists(location):
+ parser.error('config-location %s does not exist.' % location)
+ custom_conf = _readOptions(location)
+ elif '--ignore-config' in command_line_conf:
+ pass
else:
system_conf = _readOptions('/etc/youtube-dl.conf')
- if '--ignore-config' in system_conf:
- user_conf = []
- else:
+ if '--ignore-config' not in system_conf:
user_conf = _readUserConf()
- argv = system_conf + user_conf + command_line_conf
+ argv = system_conf + user_conf + custom_conf + command_line_conf
opts, args = parser.parse_args(argv)
if opts.verbose:
- write_string('[debug] System config: ' + repr(_hide_login_info(system_conf)) + '\n')
- write_string('[debug] User config: ' + repr(_hide_login_info(user_conf)) + '\n')
- write_string('[debug] Command-line args: ' + repr(_hide_login_info(command_line_conf)) + '\n')
+ for conf_label, conf in (
+ ('System config', system_conf),
+ ('User config', user_conf),
+ ('Custom config', custom_conf),
+ ('Command-line args', command_line_conf)):
+ write_string('[debug] %s: %s\n' % (conf_label, repr(_hide_login_info(conf))))
return parser, opts, args
self._titleregex = self.format_to_regex(titleformat)
def format_to_regex(self, fmt):
- """
+ r"""
Converts a string like
'%(title)s - %(artist)s'
to a regex like
ATYP_IPV6 = 0x04
-class ProxyError(IOError):
+class ProxyError(socket.error):
ERR_SUCCESS = 0x00
def __init__(self, code=None, msg=None):
if code is not None and msg is None:
- msg = self.CODES.get(code) and 'unknown error'
+ msg = self.CODES.get(code) or 'unknown error'
super(ProxyError, self).__init__(code, msg)
while len(data) < cnt:
cur = self.recv(cnt - len(data))
if not cur:
- raise IOError('{0} bytes missing'.format(cnt - len(data)))
+ raise EOFError('{0} bytes missing'.format(cnt - len(data)))
data += cur
return data
}
+USER_AGENTS = {
+ 'Safari': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27',
+}
+
+
NO_DEFAULT = object()
ENGLISH_MONTH_NAMES = [
'%d %B %Y',
'%d %b %Y',
'%B %d %Y',
+ '%B %dst %Y',
+ '%B %dnd %Y',
+ '%B %dth %Y',
'%b %d %Y',
+ '%b %dst %Y',
+ '%b %dnd %Y',
+ '%b %dth %Y',
'%b %dst %Y %I:%M',
'%b %dnd %Y %I:%M',
'%b %dth %Y %I:%M',
'%Y/%m/%d',
'%Y/%m/%d %H:%M',
'%Y/%m/%d %H:%M:%S',
+ '%Y-%m-%d %H:%M',
'%Y-%m-%d %H:%M:%S',
'%Y-%m-%d %H:%M:%S.%f',
'%d.%m.%Y %H:%M',
if drive_or_unc:
norm_path.pop(0)
sanitized_path = [
- path_part if path_part in ['.', '..'] else re.sub('(?:[/<>:"\\|\\\\?\\*]|[\s.]$)', '#', path_part)
+ path_part if path_part in ['.', '..'] else re.sub(r'(?:[/<>:"\|\\?\*]|[\s.]$)', '#', path_part)
for path_part in norm_path]
if drive_or_unc:
sanitized_path.insert(0, drive_or_unc + os.path.sep)
return today
if date_str == 'yesterday':
return today - datetime.timedelta(days=1)
- match = re.match('(now|today)(?P<sign>[+-])(?P<time>\d+)(?P<unit>day|week|month|year)(s)?', date_str)
+ match = re.match(r'(now|today)(?P<sign>[+-])(?P<time>\d+)(?P<unit>day|week|month|year)(s)?', date_str)
if match is not None:
sign = match.group('sign')
time = int(match.group('time'))
return re.match(r'https?://[^?#&]+/', url).group()
+def urljoin(base, path):
+ if not isinstance(path, compat_str) or not path:
+ return None
+ if re.match(r'^(?:https?:)?//', path):
+ return path
+ if not isinstance(base, compat_str) or not re.match(r'^(?:https?:)?//', base):
+ return None
+ return compat_urlparse.urljoin(base, path)
+
+
class HEADRequest(compat_urllib_request.Request):
def get_method(self):
return 'HEAD'
s = s.strip()
days, hours, mins, secs, ms = [None] * 5
- m = re.match(r'(?:(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?(?P<mins>[0-9]+):)?(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?$', s)
+ m = re.match(r'(?:(?:(?:(?P<days>[0-9]+):)?(?P<hours>[0-9]+):)?(?P<mins>[0-9]+):)?(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?Z?$', s)
if m:
days, hours, mins, secs, ms = m.groups()
else:
)?
(?:
(?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*s(?:ec(?:ond)?s?)?\s*
- )?$''', s)
+ )?Z?$''', s)
if m:
days, hours, mins, secs, ms = m.groups()
else:
- m = re.match(r'(?i)(?:(?P<hours>[0-9.]+)\s*(?:hours?)|(?P<mins>[0-9.]+)\s*(?:mins?\.?|minutes?)\s*)$', s)
+ m = re.match(r'(?i)(?:(?P<hours>[0-9.]+)\s*(?:hours?)|(?P<mins>[0-9.]+)\s*(?:mins?\.?|minutes?)\s*)Z?$', s)
if m:
hours, mins = m.groups()
else:
def js_to_json(code):
+ COMMENT_RE = r'/\*(?:(?!\*/).)*?\*/|//[^\n]*'
+ SKIP_RE = r'\s*(?:{comment})?\s*'.format(comment=COMMENT_RE)
+ INTEGER_TABLE = (
+ (r'(?s)^(0[xX][0-9a-fA-F]+){skip}:?$'.format(skip=SKIP_RE), 16),
+ (r'(?s)^(0+[0-7]+){skip}:?$'.format(skip=SKIP_RE), 8),
+ )
+
def fix_kv(m):
v = m.group(0)
if v in ('true', 'false', 'null'):
return v
- elif v.startswith('/*') or v == ',':
+ elif v.startswith('/*') or v.startswith('//') or v == ',':
return ""
if v[0] in ("'", '"'):
'\\x': '\\u00',
}.get(m.group(0), m.group(0)), v[1:-1])
- INTEGER_TABLE = (
- (r'^(0[xX][0-9a-fA-F]+)\s*:?$', 16),
- (r'^(0+[0-7]+)\s*:?$', 8),
- )
-
for regex, base in INTEGER_TABLE:
im = re.match(regex, v)
if im:
return re.sub(r'''(?sx)
"(?:[^"\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^"\\]*"|
'(?:[^'\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^'\\]*'|
- /\*.*?\*/|,(?=\s*[\]}])|
+ {comment}|,(?={skip}[\]}}])|
[a-zA-Z_][.a-zA-Z_0-9]*|
- \b(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:\s*:)?|
- [0-9]+(?=\s*:)
- ''', fix_kv, code)
+ \b(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:{skip}:)?|
+ [0-9]+(?={skip}:)
+ '''.format(comment=COMMENT_RE, skip=SKIP_RE), fix_kv, code)
def qualities(quality_ids):
from __future__ import unicode_literals
-__version__ = '2016.12.01'
+__version__ = '2017.02.07'