[cleanup, docs] Misc cleanup

Closes #2828, closes #2734, closes #2802, closes #2937
pukkandan 2022-03-04 19:38:55 +05:30
parent c89bec262c
commit 08d30158ec
GPG key ID: 7EEE9E1E817D0A39
20 changed files with 114 additions and 87 deletions

2
.gitignore vendored
View file

@ -24,6 +24,7 @@ cookies
*.3gp *.3gp
*.ape *.ape
*.ass
*.avi *.avi
*.desktop *.desktop
*.flac *.flac
@ -106,6 +107,7 @@ yt-dlp.zip
*.iml *.iml
.vscode .vscode
*.sublime-* *.sublime-*
*.code-workspace
# Lazy extractors # Lazy extractors
*/extractor/lazy_extractors.py */extractor/lazy_extractors.py

View file

@ -11,6 +11,7 @@
- [Is anyone going to need the feature?](#is-anyone-going-to-need-the-feature) - [Is anyone going to need the feature?](#is-anyone-going-to-need-the-feature)
- [Is your question about yt-dlp?](#is-your-question-about-yt-dlp) - [Is your question about yt-dlp?](#is-your-question-about-yt-dlp)
- [Are you willing to share account details if needed?](#are-you-willing-to-share-account-details-if-needed) - [Are you willing to share account details if needed?](#are-you-willing-to-share-account-details-if-needed)
- [Is the website primarily used for piracy](#is-the-website-primarily-used-for-piracy)
- [DEVELOPER INSTRUCTIONS](#developer-instructions) - [DEVELOPER INSTRUCTIONS](#developer-instructions)
- [Adding new feature or making overarching changes](#adding-new-feature-or-making-overarching-changes) - [Adding new feature or making overarching changes](#adding-new-feature-or-making-overarching-changes)
- [Adding support for a new site](#adding-support-for-a-new-site) - [Adding support for a new site](#adding-support-for-a-new-site)
@ -24,6 +25,7 @@
- [Collapse fallbacks](#collapse-fallbacks) - [Collapse fallbacks](#collapse-fallbacks)
- [Trailing parentheses](#trailing-parentheses) - [Trailing parentheses](#trailing-parentheses)
- [Use convenience conversion and parsing functions](#use-convenience-conversion-and-parsing-functions) - [Use convenience conversion and parsing functions](#use-convenience-conversion-and-parsing-functions)
- [My pull request is labeled pending-fixes](#my-pull-request-is-labeled-pending-fixes)
- [EMBEDDING YT-DLP](README.md#embedding-yt-dlp) - [EMBEDDING YT-DLP](README.md#embedding-yt-dlp)
@ -123,6 +125,10 @@ While these steps won't necessarily ensure that no misuse of the account takes p
- Change the password before sharing the account to something random (use [this](https://passwordsgenerator.net/) if you don't have a random password generator). - Change the password before sharing the account to something random (use [this](https://passwordsgenerator.net/) if you don't have a random password generator).
- Change the password after receiving the account back. - Change the password after receiving the account back.
### Is the website primarily used for piracy?
We follow [youtube-dl's policy](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) to not support services that are primarily used for infringing copyright. Additionally, it has been decided not to support porn sites that specialize in deepfakes. We also cannot support any service that serves only [DRM protected content](https://en.wikipedia.org/wiki/Digital_rights_management).
@ -210,7 +216,7 @@ After you have ensured this site is distributing its content legally, you can fo
} }
``` ```
1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py). 1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py).
1. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all` 1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Make sure you have at least one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running. 1. Make sure you have at least one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want. 1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart): 1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
@ -658,6 +664,10 @@ duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views')) view_count = int_or_none(video.get('views'))
``` ```
# My pull request is labeled pending-fixes
The `pending-fixes` label is added when there are changes requested to a PR. When the necessary changes are made, the label should be removed. However, despite our best efforts, it may sometimes happen that the maintainer did not see the changes or forgot to remove the label. If your PR is still marked as `pending-fixes` a few days after all requested changes have been made, feel free to ping the maintainer who labeled your issue and ask them to re-review and remove the label.

View file

@ -146,7 +146,7 @@ chio0hai
cntrl-s cntrl-s
Deer-Spangle Deer-Spangle
DEvmIb DEvmIb
Grabien Grabien/MaximVol
j54vc1bk j54vc1bk
mpeter50 mpeter50
mrpapersonic mrpapersonic
@ -160,7 +160,7 @@ PilzAdam
zmousm zmousm
iw0nderhow iw0nderhow
unit193 unit193
TwoThousandHedgehogs TwoThousandHedgehogs/KathrynElrod
Jertzukka Jertzukka
cypheron cypheron
Hyeeji Hyeeji

View file

@ -16,7 +16,7 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites com
clean-test: clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \ rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \ *.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \ *.3gp *.ape *.ass *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp *.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist: clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \ rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \

View file

@ -112,7 +112,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
* **Other new options**: Many new options have been added such as `--concat-playlist`, `--print`, `--wait-for-video`, `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc * **Other new options**: Many new options have been added such as `--concat-playlist`, `--print`, `--wait-for-video`, `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Improvements**: Regex and other operators in `--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc * **Improvements**: Regex and other operators in `--format`/`--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details * **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details
@ -130,7 +130,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order * The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
* The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this * The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
* Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both * Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
* `--ignore-errors` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead * `--no-abort-on-error` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
* When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files * When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files
* `--add-metadata` attaches the `infojson` to `mkv` files in addition to writing the metadata when used with `--write-info-json`. Use `--no-embed-info-json` or `--compat-options no-attach-info-json` to revert this * `--add-metadata` attaches the `infojson` to `mkv` files in addition to writing the metadata when used with `--write-info-json`. Use `--no-embed-info-json` or `--compat-options no-attach-info-json` to revert this
* Some metadata are embedded into different fields when using `--add-metadata` as compared to youtube-dl. Most notably, `comment` field contains the `webpage_url` and `synopsis` contains the `description`. You can [use `--parse-metadata`](#modifying-metadata) to modify this to your liking or use `--compat-options embed-metadata` to revert this * Some metadata are embedded into different fields when using `--add-metadata` as compared to youtube-dl. Most notably, `comment` field contains the `webpage_url` and `synopsis` contains the `description`. You can [use `--parse-metadata`](#modifying-metadata) to modify this to your liking or use `--compat-options embed-metadata` to revert this
@ -267,7 +267,7 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
* [**pycryptodomex**](https://github.com/Legrandin/pycryptodome) - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD2](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst) * [**pycryptodomex**](https://github.com/Legrandin/pycryptodome) - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD2](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
* [**websockets**](https://github.com/aaugustin/websockets) - For downloading over websocket. Licensed under [BSD3](https://github.com/aaugustin/websockets/blob/main/LICENSE) * [**websockets**](https://github.com/aaugustin/websockets) - For downloading over websocket. Licensed under [BSD3](https://github.com/aaugustin/websockets/blob/main/LICENSE)
* [**secretstorage**](https://github.com/mitya57/secretstorage) - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD](https://github.com/mitya57/secretstorage/blob/master/LICENSE) * [**secretstorage**](https://github.com/mitya57/secretstorage) - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen is not present. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING) * [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen/ffmpeg cannot. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**brotli**](https://github.com/google/brotli) or [**brotlicffi**](https://github.com/python-hyper/brotlicffi) - [Brotli](https://en.wikipedia.org/wiki/Brotli) content encoding support. Both licensed under MIT <sup>[1](https://github.com/google/brotli/blob/master/LICENSE) [2](https://github.com/python-hyper/brotlicffi/blob/master/LICENSE) </sup> * [**brotli**](https://github.com/google/brotli) or [**brotlicffi**](https://github.com/python-hyper/brotlicffi) - [Brotli](https://en.wikipedia.org/wiki/Brotli) content encoding support. Both licensed under MIT <sup>[1](https://github.com/google/brotli/blob/master/LICENSE) [2](https://github.com/python-hyper/brotlicffi/blob/master/LICENSE) </sup>
* [**rtmpdump**](http://rtmpdump.mplayerhq.hu) - For downloading `rtmp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](http://rtmpdump.mplayerhq.hu) * [**rtmpdump**](http://rtmpdump.mplayerhq.hu) - For downloading `rtmp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](http://rtmpdump.mplayerhq.hu)
* [**mplayer**](http://mplayerhq.hu/design7/info.html) or [**mpv**](https://mpv.io) - For downloading `rtsp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](https://github.com/mpv-player/mpv/blob/master/Copyright) * [**mplayer**](http://mplayerhq.hu/design7/info.html) or [**mpv**](https://mpv.io) - For downloading `rtsp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](https://github.com/mpv-player/mpv/blob/master/Copyright)
@ -279,6 +279,7 @@ To use or redistribute the dependencies, you must agree to their respective lice
The Windows and MacOS standalone release binaries are already built with the python interpreter, mutagen, pycryptodomex and websockets included. The Windows and MacOS standalone release binaries are already built with the python interpreter, mutagen, pycryptodomex and websockets included.
<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
**Note**: There are some regressions in newer ffmpeg versions that cause various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds **Note**: There are some regressions in newer ffmpeg versions that cause various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
@ -606,11 +607,11 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--write-description etc. (default) --write-description etc. (default)
--no-write-playlist-metafiles Do not write playlist metadata when using --no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc. --write-info-json, --write-description etc.
--clean-infojson Remove some private fields such as --clean-info-json Remove some private fields such as
filenames from the infojson. Note that it filenames from the infojson. Note that it
could still contain some personal could still contain some personal
information (default) information (default)
--no-clean-infojson Write all fields to the infojson --no-clean-info-json Write all fields to the infojson
--write-comments Retrieve video comments to be placed in the --write-comments Retrieve video comments to be placed in the
infojson. The comments are fetched even infojson. The comments are fetched even
without this option if the extraction is without this option if the extraction is
@ -1599,25 +1600,28 @@ This option also has a few special uses:
* You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)"` will download the first vimeo video found in the description * You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)"` will download the first vimeo video found in the description
* You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file. For example, you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values. * You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file. For example, you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values.
**Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes.
For reference, these are the fields yt-dlp adds by default to the file metadata: For reference, these are the fields yt-dlp adds by default to the file metadata:
Metadata fields|From Metadata fields | From
:---|:--- :--------------------------|:------------------------------------------------
`title`|`track` or `title` `title` | `track` or `title`
`date`|`upload_date` `date` | `upload_date`
`description`, `synopsis`|`description` `description`, `synopsis` | `description`
`purl`, `comment`|`webpage_url` `purl`, `comment` | `webpage_url`
`track`|`track_number` `track` | `track_number`
`artist`|`artist`, `creator`, `uploader` or `uploader_id` `artist` | `artist`, `creator`, `uploader` or `uploader_id`
`genre`|`genre` `genre` | `genre`
`album`|`album` `album` | `album`
`album_artist`|`album_artist` `album_artist` | `album_artist`
`disc`|`disc_number` `disc` | `disc_number`
`show`|`series` `show` | `series`
`season_number`|`season_number` `season_number` | `season_number`
`episode_id`|`episode` or `episode_id` `episode_id` | `episode` or `episode_id`
`episode_sort`|`episode_number` `episode_sort` | `episode_number`
`language` of each stream|From the format's `language` `language` of each stream | the format's `language`
**Note**: The file format may not support some of these fields **Note**: The file format may not support some of these fields
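
The `additional_urls` example in the hunk above relies on a named capture group. As a rough illustration of what that regex captures (a sketch using only Python's `re` module, not yt-dlp's own implementation; the sample description text is made up):

```python
import re

# The regex part of the --parse-metadata example, i.e. everything after "description:"
pattern = re.compile(r'(?P<additional_urls>https?://www\.vimeo\.com/\d+)')

# Hypothetical description text, used only for this illustration
description = 'Full version: https://www.vimeo.com/123456 and mirror https://www.vimeo.com/789'

# search() stops at the first match, so only the first URL is captured
match = pattern.search(description)
fields = match.groupdict() if match else {}
print(fields)
```

The name of the capture group is what decides which field receives the matched text.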
@ -1816,12 +1820,11 @@ ydl_opts = {
}], }],
'logger': MyLogger(), 'logger': MyLogger(),
'progress_hooks': [my_hook], 'progress_hooks': [my_hook],
# Add custom headers
'http_headers': {'Referer': 'https://www.google.com'}
} }
# Add custom headers
yt_dlp.utils.std_headers.update({'Referer': 'https://www.google.com'})
# See the public functions in yt_dlp.YoutubeDL for other available functions. # See the public functions in yt_dlp.YoutubeDL for other available functions.
# Eg: "ydl.download", "ydl.download_with_info_file" # Eg: "ydl.download", "ydl.download_with_info_file"
with yt_dlp.YoutubeDL(ydl_opts) as ydl: with yt_dlp.YoutubeDL(ydl_opts) as ydl:

View file

@ -75,7 +75,11 @@ def filter_options(readme):
section = re.search(r'(?sm)^# USAGE AND OPTIONS\n.+?(?=^# )', readme).group(0) section = re.search(r'(?sm)^# USAGE AND OPTIONS\n.+?(?=^# )', readme).group(0)
options = '# OPTIONS\n' options = '# OPTIONS\n'
for line in section.split('\n')[1:]: for line in section.split('\n')[1:]:
mobj = re.fullmatch(r'\s{4}(?P<opt>-(?:,\s|[^\s])+)(?:\s(?P<meta>([^\s]|\s(?!\s))+))?(\s{2,}(?P<desc>.+))?', line) mobj = re.fullmatch(r'''(?x)
\s{4}(?P<opt>-(?:,\s|[^\s])+)
(?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
(\s{2,}(?P<desc>.+))?
''', line)
if not mobj: if not mobj:
options += f'{line.lstrip()}\n' options += f'{line.lstrip()}\n'
continue continue
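
To see what the verbose-mode regex in this hunk captures, here is a small standalone sketch. The pattern is copied from the new side of the diff; the helper name `parse_option_line` and the sample lines are assumptions made for illustration and do not come from the repository.

```python
import re

# Pattern copied from the new side of the hunk; (?x) lets it span several lines
OPTION_LINE = re.compile(r'''(?x)
    \s{4}(?P<opt>-(?:,\s|[^\s])+)
    (?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
    (\s{2,}(?P<desc>.+))?
''')


def parse_option_line(line):
    """Return the named groups for an option line, or None for any other line."""
    mobj = OPTION_LINE.fullmatch(line)
    return mobj.groupdict() if mobj else None


# Hypothetical help-text lines in the layout the pattern expects
with_meta = parse_option_line('    -f, --format FORMAT             Video format code')
without_meta = parse_option_line('    --no-clean-info-json            Write all fields to the infojson')
continuation = parse_option_line('                                    filenames from the infojson')
```

Continuation lines do not start with exactly four spaces followed by `-`, so they fall through to the `if not mobj` branch shown in the diff.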

View file

@ -21,7 +21,7 @@ DESCRIPTION = 'A youtube-dl fork with additional features and patches'
LONG_DESCRIPTION = '\n\n'.join(( LONG_DESCRIPTION = '\n\n'.join((
'Official repository: <https://github.com/yt-dlp/yt-dlp>', 'Official repository: <https://github.com/yt-dlp/yt-dlp>',
'**PS**: Some links in this document will not work since this is a copy of the README.md from Github', '**PS**: Some links in this document will not work since this is a copy of the README.md from Github',
open('README.md', 'r', encoding='utf-8').read())) open('README.md').read()))
REQUIREMENTS = open('requirements.txt').read().splitlines() REQUIREMENTS = open('requirements.txt').read().splitlines()

View file

@ -235,6 +235,8 @@ class YoutubeDL(object):
See "Sorting Formats" for more details. See "Sorting Formats" for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats" format_sort_force: Force the given format_sort. see "Sorting Formats"
for more details. for more details.
prefer_free_formats: Whether to prefer video formats with free containers
over non-free ones of same quality.
allow_multiple_video_streams: Allow multiple video streams to be merged allow_multiple_video_streams: Allow multiple video streams to be merged
into a single file into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged allow_multiple_audio_streams: Allow multiple audio streams to be merged

View file

@ -22,6 +22,9 @@ class YoutubeLiveChatFD(FragmentFD):
def real_download(self, filename, info_dict): def real_download(self, filename, info_dict):
video_id = info_dict['video_id'] video_id = info_dict['video_id']
self.to_screen('[%s] Downloading live chat' % self.FD_NAME) self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
if not self.params.get('skip_download'):
self.report_warning('Live chat download runs until the livestream ends. '
'If you wish to download the video simultaneously, run a separate yt-dlp instance')
fragment_retries = self.params.get('fragment_retries', 0) fragment_retries = self.params.get('fragment_retries', 0)
test = self.params.get('test', False) test = self.params.get('test', False)

View file

@ -8,10 +8,6 @@ import struct
from base64 import urlsafe_b64encode from base64 import urlsafe_b64encode
from binascii import unhexlify from binascii import unhexlify
import typing
if typing.TYPE_CHECKING:
from ..YoutubeDL import YoutubeDL
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_ecb_decrypt from ..aes import aes_ecb_decrypt
from ..compat import ( from ..compat import (
@ -36,15 +32,15 @@ from ..utils import (
# NOTE: network handler related code is temporary thing until network stack overhaul PRs are merged (#2861/#2862) # NOTE: network handler related code is temporary thing until network stack overhaul PRs are merged (#2861/#2862)
def add_opener(self: 'YoutubeDL', handler): def add_opener(ydl, handler):
''' Add a handler for opening URLs, like _download_webpage ''' ''' Add a handler for opening URLs, like _download_webpage '''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426 # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605 # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
assert isinstance(self._opener, compat_urllib_request.OpenerDirector) assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
self._opener.add_handler(handler) ydl._opener.add_handler(handler)
def remove_opener(self: 'YoutubeDL', handler): def remove_opener(ydl, handler):
''' '''
Remove handler(s) for opening URLs Remove handler(s) for opening URLs
@param handler Either handler object itself or handler type. @param handler Either handler object itself or handler type.
@ -52,8 +48,8 @@ def remove_opener(self: 'YoutubeDL', handler):
''' '''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426 # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605 # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
opener = self._opener opener = ydl._opener
assert isinstance(self._opener, compat_urllib_request.OpenerDirector) assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
if isinstance(handler, (type, tuple)): if isinstance(handler, (type, tuple)):
find_cp = lambda x: isinstance(x, handler) find_cp = lambda x: isinstance(x, handler)
else: else:
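
The renamed `add_opener(ydl, handler)` is a plain module-level function that receives the downloader as an argument. A minimal standard-library sketch of the same idea follows; `StubDownloader` and `TaggingHandler` are hypothetical stand-ins, and `urllib.request` is used directly where the diff goes through `compat_urllib_request`.

```python
import urllib.request


class StubDownloader:
    """Hypothetical stand-in for a YoutubeDL instance; only `_opener` matters here."""

    def __init__(self):
        self._opener = urllib.request.build_opener()


class TaggingHandler(urllib.request.BaseHandler):
    """Hypothetical handler that adds a header to each plain-HTTP request."""

    def http_request(self, request):
        request.add_header('X-Example', '1')
        return request


def add_opener(ydl, handler):
    """Register a urllib handler on the downloader's opener."""
    assert isinstance(ydl._opener, urllib.request.OpenerDirector)
    ydl._opener.add_handler(handler)


ydl = StubDownloader()
handler = TaggingHandler()
add_opener(ydl, handler)
```

Note that `OpenerDirector.add_handler` only keeps handlers that define at least one protocol method such as `http_request`, which is why the stub handler defines one.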

View file

@ -97,8 +97,8 @@ class Ant1NewsGrArticleIE(Ant1NewsGrBaseIE):
embed_urls = list(Ant1NewsGrEmbedIE._extract_urls(webpage)) embed_urls = list(Ant1NewsGrEmbedIE._extract_urls(webpage))
if not embed_urls: if not embed_urls:
raise ExtractorError('no videos found for %s' % video_id, expected=True) raise ExtractorError('no videos found for %s' % video_id, expected=True)
return self.url_result_or_playlist_from_matches( return self.playlist_from_matches(
embed_urls, video_id, info['title'], ie=Ant1NewsGrEmbedIE.ie_key(), embed_urls, video_id, info.get('title'), ie=Ant1NewsGrEmbedIE.ie_key(),
video_kwargs={'url_transparent': True, 'timestamp': info.get('timestamp')}) video_kwargs={'url_transparent': True, 'timestamp': info.get('timestamp')})

View file

@ -226,6 +226,7 @@ class InfoExtractor(object):
The following fields are optional: The following fields are optional:
direct: True if a direct video file was given (must only be set by GenericIE)
alt_title: A secondary title of the video. alt_title: A secondary title of the video.
display_id An alternative identifier for the video, not necessarily display_id An alternative identifier for the video, not necessarily
unique, but available before title. Typically, id is unique, but available before title. Typically, id is
@ -274,7 +275,7 @@ class InfoExtractor(object):
* "url": A URL pointing to the subtitles file * "url": A URL pointing to the subtitles file
It can optionally also have: It can optionally also have:
* "name": Name or description of the subtitles * "name": Name or description of the subtitles
* http_headers: A dictionary of additional HTTP headers * "http_headers": A dictionary of additional HTTP headers
to add to the request. to add to the request.
"ext" will be calculated from URL if missing "ext" will be calculated from URL if missing
automatic_captions: Like 'subtitles'; contains automatically generated automatic_captions: Like 'subtitles'; contains automatically generated
@ -425,8 +426,8 @@ class InfoExtractor(object):
title, description etc. title, description etc.
Subclasses of this one should re-define the _real_initialize() and Subclasses of this should define a _VALID_URL regexp and re-define the
_real_extract() methods and define a _VALID_URL regexp. _real_extract() and (optionally) _real_initialize() methods.
Probably, they should also be added to the list of extractors. Probably, they should also be added to the list of extractors.
Subclasses may also override suitable() if necessary, but ensure the function Subclasses may also override suitable() if necessary, but ensure the function
@ -661,7 +662,7 @@ class InfoExtractor(object):
return False return False
def set_downloader(self, downloader): def set_downloader(self, downloader):
"""Sets the downloader for this IE.""" """Sets a YoutubeDL instance as the downloader for this IE."""
self._downloader = downloader self._downloader = downloader
def _real_initialize(self): def _real_initialize(self):
@ -670,7 +671,7 @@ class InfoExtractor(object):
def _real_extract(self, url): def _real_extract(self, url):
"""Real extraction process. Redefine in subclasses.""" """Real extraction process. Redefine in subclasses."""
pass raise NotImplementedError('This method must be implemented by subclasses')
@classmethod @classmethod
def ie_key(cls): def ie_key(cls):
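
Replacing `pass` with `raise NotImplementedError` turns a silently returned `None` into an explicit failure when a subclass forgets to override the hook. A simplified sketch of the pattern, where `BaseExtractor`, `ExampleExtractor` and `try_extract` are illustrative names rather than yt-dlp classes:

```python
class BaseExtractor:
    """Simplified stand-in for an extractor base class."""

    def extract(self, url):
        # Public entry point delegates to the hook that subclasses must override
        return self._real_extract(url)

    def _real_extract(self, url):
        raise NotImplementedError('This method must be implemented by subclasses')


class ExampleExtractor(BaseExtractor):
    def _real_extract(self, url):
        # Use the last path segment as a made-up video id
        return {'id': url.rsplit('/', 1)[-1], 'url': url}


def try_extract(extractor, url):
    """Run the extractor and report a missing override as a string."""
    try:
        return extractor.extract(url)
    except NotImplementedError as exc:
        return f'error: {exc}'
```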
@ -1661,31 +1662,31 @@ class InfoExtractor(object):
'format_id': {'type': 'alias', 'field': 'id'}, 'format_id': {'type': 'alias', 'field': 'id'},
'preference': {'type': 'alias', 'field': 'ie_pref'}, 'preference': {'type': 'alias', 'field': 'ie_pref'},
'language_preference': {'type': 'alias', 'field': 'lang'}, 'language_preference': {'type': 'alias', 'field': 'lang'},
'source_preference': {'type': 'alias', 'field': 'source'},
'protocol': {'type': 'alias', 'field': 'proto'},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
# Deprecated # Deprecated
'dimension': {'type': 'alias', 'field': 'res'}, 'dimension': {'type': 'alias', 'field': 'res', 'deprecated': True},
'resolution': {'type': 'alias', 'field': 'res'}, 'resolution': {'type': 'alias', 'field': 'res', 'deprecated': True},
'extension': {'type': 'alias', 'field': 'ext'}, 'extension': {'type': 'alias', 'field': 'ext', 'deprecated': True},
'bitrate': {'type': 'alias', 'field': 'br'}, 'bitrate': {'type': 'alias', 'field': 'br', 'deprecated': True},
'total_bitrate': {'type': 'alias', 'field': 'tbr'}, 'total_bitrate': {'type': 'alias', 'field': 'tbr', 'deprecated': True},
'video_bitrate': {'type': 'alias', 'field': 'vbr'}, 'video_bitrate': {'type': 'alias', 'field': 'vbr', 'deprecated': True},
'audio_bitrate': {'type': 'alias', 'field': 'abr'}, 'audio_bitrate': {'type': 'alias', 'field': 'abr', 'deprecated': True},
'framerate': {'type': 'alias', 'field': 'fps'}, 'framerate': {'type': 'alias', 'field': 'fps', 'deprecated': True},
'protocol': {'type': 'alias', 'field': 'proto'}, 'filesize_estimate': {'type': 'alias', 'field': 'size', 'deprecated': True},
'source_preference': {'type': 'alias', 'field': 'source'}, 'samplerate': {'type': 'alias', 'field': 'asr', 'deprecated': True},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'}, 'video_ext': {'type': 'alias', 'field': 'vext', 'deprecated': True},
'filesize_estimate': {'type': 'alias', 'field': 'size'}, 'audio_ext': {'type': 'alias', 'field': 'aext', 'deprecated': True},
'samplerate': {'type': 'alias', 'field': 'asr'}, 'video_codec': {'type': 'alias', 'field': 'vcodec', 'deprecated': True},
'video_ext': {'type': 'alias', 'field': 'vext'}, 'audio_codec': {'type': 'alias', 'field': 'acodec', 'deprecated': True},
'audio_ext': {'type': 'alias', 'field': 'aext'}, 'video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
'video_codec': {'type': 'alias', 'field': 'vcodec'}, 'has_video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
'audio_codec': {'type': 'alias', 'field': 'acodec'}, 'audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
'video': {'type': 'alias', 'field': 'hasvid'}, 'has_audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
'has_video': {'type': 'alias', 'field': 'hasvid'}, 'extractor': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
'audio': {'type': 'alias', 'field': 'hasaud'}, 'extractor_preference': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
'has_audio': {'type': 'alias', 'field': 'hasaud'},
'extractor': {'type': 'alias', 'field': 'ie_pref'},
'extractor_preference': {'type': 'alias', 'field': 'ie_pref'},
} }
def __init__(self, ie, field_preference): def __init__(self, ie, field_preference):
@ -1785,7 +1786,7 @@ class InfoExtractor(object):
continue continue
if self._get_field_setting(field, 'type') == 'alias': if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field') alias, field = field, self._get_field_setting(field, 'field')
if alias not in ('format_id', 'preference', 'language_preference'): if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning( self.ydl.deprecation_warning(
f'Format sorting alias {alias} is deprecated ' f'Format sorting alias {alias} is deprecated '
f'and may be removed in a future version. Please use {field} instead') f'and may be removed in a future version. Please use {field} instead')
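
Note on the hunk above: the hard-coded allow-list of alias names is replaced by a per-entry 'deprecated' flag in the settings table. A minimal sketch of that lookup, assuming a trimmed-down table and a hypothetical `resolve_alias` helper (neither is the actual yt-dlp code; only the shape of the entries mirrors the diff):

```python
# Hypothetical, trimmed-down settings table; entries follow the shape used in the diff.
SETTINGS = {
    'format_id': {'type': 'alias', 'field': 'id'},
    'resolution': {'type': 'alias', 'field': 'res', 'deprecated': True},
}


def resolve_alias(name, warn):
    """Return the real field for `name`, warning only if the alias is flagged deprecated."""
    setting = SETTINGS.get(name, {})
    if setting.get('type') != 'alias':
        return name  # not an alias: use the name as given
    field = setting['field']
    if setting.get('deprecated'):
        warn(f'Format sorting alias {name} is deprecated '
             f'and may be removed in a future version. Please use {field} instead')
    return field


warnings = []
resolved = [resolve_alias(n, warnings.append) for n in ('format_id', 'resolution', 'res')]
```

With this layout, marking another alias as deprecated only requires editing its table entry, not the sorting logic.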

@ -252,9 +252,9 @@ class FrontendMastersCourseIE(FrontendMastersPageBaseIE):
entries = [] entries = []
for lesson in lessons: for lesson in lessons:
lesson_name = lesson.get('slug') lesson_name = lesson.get('slug')
if not lesson_name:
continue
lesson_id = lesson.get('hash') or lesson.get('statsId') lesson_id = lesson.get('hash') or lesson.get('statsId')
if not lesson_id or not lesson_name:
continue
entries.append(self._extract_lesson(chapters, lesson_id, lesson)) entries.append(self._extract_lesson(chapters, lesson_id, lesson))
title = course.get('title') title = course.get('title')

@ -621,7 +621,7 @@ class IqIE(InfoExtractor):
preview_time = traverse_obj( preview_time = traverse_obj(
initial_format_data, ('boss_ts', (None, 'data'), ('previewTime', 'rtime')), expected_type=float_or_none, get_all=False) initial_format_data, ('boss_ts', (None, 'data'), ('previewTime', 'rtime')), expected_type=float_or_none, get_all=False)
if traverse_obj(initial_format_data, ('boss_ts', 'data', 'prv'), expected_type=int_or_none): if traverse_obj(initial_format_data, ('boss_ts', 'data', 'prv'), expected_type=int_or_none):
self.report_warning('This preview video is limited%s' % format_field(preview_time, template='to %s seconds')) self.report_warning('This preview video is limited%s' % format_field(preview_time, template=' to %s seconds'))
# TODO: Extract audio-only formats # TODO: Extract audio-only formats
for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none, default=[])): for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none, default=[])):

@ -33,7 +33,7 @@ class PeriscopeBaseIE(InfoExtractor):
return { return {
'id': broadcast.get('id') or video_id, 'id': broadcast.get('id') or video_id,
'title': self._live_title(title) if is_live else title, 'title': title,
'timestamp': parse_iso8601(broadcast.get('created_at')), 'timestamp': parse_iso8601(broadcast.get('created_at')),
'uploader': uploader, 'uploader': uploader,
'uploader_id': broadcast.get('user_id') or broadcast.get('username'), 'uploader_id': broadcast.get('user_id') or broadcast.get('username'),

@ -59,8 +59,16 @@ class SoundcloudEmbedIE(InfoExtractor):
class SoundcloudBaseIE(InfoExtractor): class SoundcloudBaseIE(InfoExtractor):
_NETRC_MACHINE = 'soundcloud'
_API_V2_BASE = 'https://api-v2.soundcloud.com/' _API_V2_BASE = 'https://api-v2.soundcloud.com/'
_BASE_URL = 'https://soundcloud.com/' _BASE_URL = 'https://soundcloud.com/'
_USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
_API_AUTH_QUERY_TEMPLATE = '?client_id=%s'
_API_AUTH_URL_PW = 'https://api-auth.soundcloud.com/web-auth/sign-in/password%s'
_API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
_access_token = None
_HEADERS = {}
def _store_client_id(self, client_id): def _store_client_id(self, client_id):
self._downloader.cache.store('soundcloud', 'client_id', client_id) self._downloader.cache.store('soundcloud', 'client_id', client_id)
@ -103,14 +111,6 @@ class SoundcloudBaseIE(InfoExtractor):
self._CLIENT_ID = self._downloader.cache.load('soundcloud', 'client_id') or 'a3e059563d7fd3372b49b37f00a00bcf' self._CLIENT_ID = self._downloader.cache.load('soundcloud', 'client_id') or 'a3e059563d7fd3372b49b37f00a00bcf'
self._login() self._login()
_USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
_API_AUTH_QUERY_TEMPLATE = '?client_id=%s'
_API_AUTH_URL_PW = 'https://api-auth.soundcloud.com/web-auth/sign-in/password%s'
_API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
_access_token = None
_HEADERS = {}
_NETRC_MACHINE = 'soundcloud'
def _login(self): def _login(self):
username, password = self._get_login_info() username, password = self._get_login_info()
if username is None: if username is None:

@ -67,6 +67,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'series': 'The Witcher', 'series': 'The Witcher',
'season': 'Misc', 'season': 'Misc',
'episode_number': 13, 'episode_number': 13,
'episode': 'Episode 13',
}, },
}, },
{ {
@ -92,6 +93,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'series': 'Arma 3', 'series': 'Arma 3',
'season': 'Zeus Games', 'season': 'Zeus Games',
'episode_number': 3, 'episode_number': 3,
'episode': 'Episode 3',
}, },
}, },
] ]

@ -3094,6 +3094,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Some formats may have much smaller duration than others (possibly damaged during encoding) # Some formats may have much smaller duration than others (possibly damaged during encoding)
# Eg: 2-nOtRESiUc Ref: https://github.com/yt-dlp/yt-dlp/issues/2823 # Eg: 2-nOtRESiUc Ref: https://github.com/yt-dlp/yt-dlp/issues/2823
is_damaged = try_get(fmt, lambda x: float(x['approxDurationMs']) < approx_duration - 10000) is_damaged = try_get(fmt, lambda x: float(x['approxDurationMs']) < approx_duration - 10000)
if is_damaged:
self.report_warning(f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True)
dct = { dct = {
'asr': int_or_none(fmt.get('audioSampleRate')), 'asr': int_or_none(fmt.get('audioSampleRate')),
'filesize': int_or_none(fmt.get('contentLength')), 'filesize': int_or_none(fmt.get('contentLength')),

@ -149,7 +149,7 @@ class ZingMp3IE(ZingMp3BaseIE):
}, },
}, { }, {
'url': 'https://zingmp3.vn/video-clip/Suong-Hoa-Dua-Loi-K-ICM-RYO/ZO8ZF7C7.html', 'url': 'https://zingmp3.vn/video-clip/Suong-Hoa-Dua-Loi-K-ICM-RYO/ZO8ZF7C7.html',
'md5': 'e9c972b693aa88301ef981c8151c4343', 'md5': 'c7f23d971ac1a4f675456ed13c9b9612',
'info_dict': { 'info_dict': {
'id': 'ZO8ZF7C7', 'id': 'ZO8ZF7C7',
'title': 'Sương Hoa Đưa Lối', 'title': 'Sương Hoa Đưa Lối',
@ -158,6 +158,8 @@ class ZingMp3IE(ZingMp3BaseIE):
'duration': 207, 'duration': 207,
'track': 'Sương Hoa Đưa Lối', 'track': 'Sương Hoa Đưa Lối',
'artist': 'K-ICM, RYO', 'artist': 'K-ICM, RYO',
'album': 'Sương Hoa Đưa Lối (Single)',
'album_artist': 'K-ICM, RYO',
}, },
}, { }, {
'url': 'https://zingmp3.vn/embed/song/ZWZEI76B?start=false', 'url': 'https://zingmp3.vn/embed/song/ZWZEI76B?start=false',

@ -1030,7 +1030,7 @@ def make_HTTPS_handler(params, **kwargs):
def bug_reports_message(before=';'): def bug_reports_message(before=';'):
msg = ('please report this issue on https://github.com/yt-dlp/yt-dlp , ' msg = ('please report this issue on https://github.com/yt-dlp/yt-dlp , '
'filling out the "Broken site" issue template properly. ' 'filling out the "Broken site" issue template properly. '
'Confirm you are on the latest version using -U') 'Confirm you are on the latest version using yt-dlp -U')
before = before.rstrip() before = before.rstrip()
if not before or before.endswith(('.', '!', '?')): if not before or before.endswith(('.', '!', '?')):
@ -5481,5 +5481,5 @@ has_websockets = bool(compat_websockets)
def merge_headers(*dicts): def merge_headers(*dicts):
"""Merge dicts of network headers case insensitively, prioritizing the latter ones""" """Merge dicts of http headers case insensitively, prioritizing the latter ones"""
return {k.capitalize(): v for k, v in itertools.chain.from_iterable(map(dict.items, dicts))} return {k.capitalize(): v for k, v in itertools.chain.from_iterable(map(dict.items, dicts))}
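
Note on `merge_headers`: the function body is copied from the hunk above; the sample header dicts below are made up for illustration. The sketch shows what "case insensitively, prioritizing the latter ones" means in practice: keys are normalized with `str.capitalize()`, so later dicts overwrite earlier ones regardless of the casing they used.

```python
import itertools


def merge_headers(*dicts):
    """Merge dicts of http headers case insensitively, prioritizing the latter ones"""
    return {k.capitalize(): v for k, v in itertools.chain.from_iterable(map(dict.items, dicts))}


# Illustrative input: the same header spelled with different casing in two dicts.
merged = merge_headers(
    {'user-agent': 'A', 'Accept': 'x'},
    {'User-Agent': 'B'},
)
print(merged)
```

Because `str.capitalize()` lowercases everything after the first character, the merged key is `User-agent` rather than `User-Agent`.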