Add archive to skip downloaded songs #1602

Merged on Oct 1, 2022 (5 commits)

@AkaTenshi (Contributor) commented Sep 22, 2022

Title

Add an archive file to skip already downloaded songs, analogous to yt-dlp.

Description

After a song downloads successfully, its URL is added to a cache.
Before each download, the song's URL is checked against the cache; if it is found, the song is skipped because it was already downloaded.
The cache is saved to a file after all downloads have finished, and on the next execution of spotdl it is loaded from the same file before downloads start.
This way the cache persists across executions.
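
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `UrlArchive`, its method names, and the one-URL-per-line file format are all assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List, Set


class UrlArchive:
    """Hypothetical in-memory cache of downloaded song URLs, backed by a text file."""

    def __init__(self) -> None:
        self.urls: Set[str] = set()

    def load(self, path: Path) -> None:
        """Read one URL per line; a missing file is treated as an empty archive."""
        if path.exists():
            with path.open("r", encoding="utf-8") as handle:
                self.urls.update(line.strip() for line in handle if line.strip())

    def save(self, path: Path) -> None:
        """Write every cached URL to the file, one per line."""
        with path.open("w", encoding="utf-8") as handle:
            for url in sorted(self.urls):
                handle.write(url + "\n")

    def mark_downloaded(self, url: str) -> None:
        """Record a URL after its song was downloaded successfully."""
        self.urls.add(url)

    def filter_new(self, urls: Iterable[str]) -> List[str]:
        """Return only the URLs that are not in the archive yet, keeping order."""
        return [url for url in urls if url not in self.urls]
```

In this sketch, a run would call `load` before downloads start, use `filter_new` to decide what to download, call `mark_downloaded` after each success, and call `save` once all downloads have finished.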

Related Issue

#1597

Motivation and Context

There is already a mechanism to detect and skip already downloaded files.
However, it needs additional network requests to the provider before it can skip a song, which hurts performance.
In addition, the detection only works if the filename of the already downloaded file is exactly the same as the one being checked.
If you download a playlist completely and then download it again later with a changed output template (e.g. adding the year to the name), all songs are redownloaded and exist twice under different names.
With the archive, users have the ability to skip already downloaded songs even if they are not present at the exact location in the filesystem.
Of course, the use of this functionality is completely optional.

How Has This Been Tested?

I downloaded my "saved" playlist twice while specifying an archive file.
The first execution downloaded 380/385 songs successfully (5 were not found on YouTube) and wrote 380 URLs into the archive file.
The second execution loaded 380 URLs from the archive file into the cache and only tried to download the remaining 5/385 songs.
There is also a new test case for saving and loading the archive in tests/utils/test_archive.py.
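
The two-run scenario above can be reproduced in miniature without any network access. The sketch below is not the test from this PR: `load_archive`, `save_archive`, `run_once`, and `demo` are hypothetical helpers, the playlist URLs are placeholders, and "not found on YouTube" is simulated with a set of unavailable URLs.

```python
import tempfile
from pathlib import Path
from typing import List, Set, Tuple


def load_archive(path: Path) -> Set[str]:
    """Load archived URLs (one per line); a missing file means nothing is archived."""
    if not path.exists():
        return set()
    return {line for line in path.read_text(encoding="utf-8").splitlines() if line}


def save_archive(path: Path, urls: Set[str]) -> None:
    """Write all archived URLs back to the file, one per line."""
    path.write_text("".join(url + "\n" for url in sorted(urls)), encoding="utf-8")


def run_once(playlist: List[str], unavailable: Set[str], path: Path) -> List[str]:
    """Simulate one execution and return the URLs it attempted to download."""
    archived = load_archive(path)
    attempted = [url for url in playlist if url not in archived]
    # Only successful downloads end up in the archive.
    archived.update(url for url in attempted if url not in unavailable)
    save_archive(path, archived)
    return attempted


def demo() -> Tuple[int, int, int]:
    """Return (attempts in run 1, attempts in run 2, archive size afterwards)."""
    playlist = [f"https://example.invalid/track/{i}" for i in range(385)]
    unavailable = set(playlist[:5])  # stands in for the 5 songs that were not found
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "archive.txt"
        first = run_once(playlist, unavailable, path)
        second = run_once(playlist, unavailable, path)
        return len(first), len(second), len(load_archive(path))
```

Under these assumptions `demo()` returns `(385, 5, 380)`, matching the numbers reported above: the second run only retries the 5 songs that never made it into the archive.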

Screenshots (if appropriate)

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the CONTRIBUTING document
  • I have read the CORE VALUES document
  • I have added tests to cover my changes
  • All new and existing tests passed

@xnetcat self-requested a review September 23, 2022 08:23

@xnetcat (Member) left a comment

Looks good but maybe you could move the archive check from download.py to downloader.

Also maybe we could use .spotdl files?

@AkaTenshi (Contributor, Author)

  • Well, I put it in download.py since a list of songs is needed, which is currently created in line 35 (songs = get_simple_songs(query)). To move my changes to Downloader, this line would need to be moved as well. I'll need to check whether I can refactor it in a backwards-compatible way.
  • What exactly do you mean by using .spotdl files? Simply enforcing the .spotdl file extension in the argument? Changing the argument to a boolean and automatically creating an archive file in the script instead of letting the user supply a path? As requested in #1597 ("Add download archive just like ytdl --archive"), I used a design similar to youtube-dl's.

@Silverarmor changed the base branch from master to dev September 24, 2022 06:28
Review thread on tests/utils/test_archive.py: outdated, resolved
@xnetcat merged commit 38cc8c5 into spotDL:dev Oct 1, 2022
@AkaTenshi deleted the add_archive branch October 1, 2022 22:46
@sanbroz commented Nov 7, 2022

While downloading a playlist, the archive file is only updated once the complete playlist has finished downloading. If the program is terminated partway through, the archive file is not updated. We should update the archive file after downloading every single song of the playlist.
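
One way to get that behaviour, sketched here under assumptions, is to append each URL to the archive file as soon as its download succeeds instead of writing the whole cache at the end. The function `download_all` and the `download_one` callback are hypothetical names for this example and are not part of spotdl.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def download_all(
    urls: Iterable[str],
    archive_path: Path,
    download_one: Callable[[str], bool],
) -> List[str]:
    """Download each URL and append it to the archive right after it succeeds.

    `download_one` is a hypothetical callback that returns True on success.
    """
    completed: List[str] = []
    for url in urls:
        if not download_one(url):
            continue  # failed downloads are not archived, so they are retried later
        # Open in append mode and close again immediately, so the entry is
        # written out before the next (possibly interrupted) download starts.
        with archive_path.open("a", encoding="utf-8") as handle:
            handle.write(url + "\n")
        completed.append(url)
    return completed
```

With this approach, if the program stops during the third song, the first two URLs are already in the archive file and would be skipped on the next run.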


4 participants