Add archive to skip downloaded songs #1602

AkaTenshi · 2022-09-22T23:51:34Z

Title

Add an archive file to skip already downloaded songs, analogous to yt-dlp.

Description

After a song is downloaded successfully, its URL is added to a cache.
Before each download, the song's URL is checked against the cache; if it is found, the song is skipped because it was already downloaded.
The cache is saved to a file after all downloads have finished, and on the next execution of spotdl it is loaded from the same file before downloads start.
This way the cache persists across executions.
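
As a rough illustration of the flow described above, here is a minimal sketch. It is not the code from this PR: the names `UrlArchive` and `filter_new`, and the one-URL-per-line file layout, are assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List


class UrlArchive(set):
    """Hypothetical cache of downloaded song URLs, backed by a text file."""

    def load(self, path: str) -> bool:
        """Read one URL per line into the set; return False if the file is missing."""
        file = Path(path)
        if not file.exists():
            return False
        with file.open("r", encoding="utf-8") as handle:
            self.update(line.strip() for line in handle if line.strip())
        return True

    def save(self, path: str) -> None:
        """Write every cached URL to the file, one per line."""
        with open(path, "w", encoding="utf-8") as handle:
            for url in sorted(self):
                handle.write(f"{url}\n")


def filter_new(urls: Iterable[str], archive: UrlArchive) -> List[str]:
    """Keep only the URLs that are not in the archive yet."""
    return [url for url in urls if url not in archive]
```

In this sketch the caller would load the archive before downloads start, pass the song URLs through `filter_new`, add each URL after a successful download, and call `save` once at the end.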

Related Issue

#1597

Motivation and Context

There is already a mechanism to detect and skip already downloaded files.
However, it needs additional network requests to the provider before skipping a song, which hurts performance.
In addition, for the detection to work, the filename of the already downloaded file has to match the checked one exactly.
If you download a playlist completely and then download it again later with a changed output template (e.g. adding the year to the name), all songs are redownloaded and exist twice under different names.
With the archive, users have the ability to skip already downloaded songs even if they are not present at the exact location in the filesystem.
Of course, the use of this functionality is completely optional.

How Has This Been Tested?

I downloaded my "saved" playlist twice while specifying an archive file.
The first execution downloaded 380/385 songs successfully (5 were not found on YouTube) and wrote 380 URLs into the archive file.
On the second execution, it loaded 380 URLs from the archive file into the cache and only tried to download the remaining 5/385 songs.
There is also a new test case for saving and loading the archive in tests/utils/test_archive.py.

Screenshots (if appropriate)

Types of Changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist

My code follows the code style of this project
My change requires a change to the documentation
I have updated the documentation accordingly
I have read the CONTRIBUTING document
I have read the CORE VALUES document
I have added tests to cover my changes
All new and existing tests passed

xnetcat

Looks good but maybe you could move the archive check from download.py to downloader.

Also maybe we could use .spotdl files?

AkaTenshi · 2022-09-23T11:34:13Z

Well, I put it in download.py since a list of songs is needed, which is currently created in line 35 (songs = get_simple_songs(query)). To move my changes to Downloader, this line would need to be moved as well. I'll need to check if I can refactor it in a backwards-compatible way.
What exactly do you mean by using .spotdl files? Simply enforcing the .spotdl file extension in the argument? Changing the argument to a boolean and automatically creating an archive file in the script instead of letting the user supply a path? As requested in "Add download archive just like ytdl --archive" #1597, I used a design similar to youtube-dl's.

tests/utils/test_archive.py

sanbroz · 2022-11-07T07:19:39Z

While downloading a playlist, the archive file is updated only once the whole playlist has finished downloading. If the program is terminated partway through, the archive file is not updated. We should update the archive file after every single song of the playlist has been downloaded.
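
One way the suggestion above could look is sketched below. It is not part of the merged change: the function name `download_with_incremental_archive` and the `download` callable are hypothetical, and the file is assumed to hold one URL per line.

```python
from typing import Callable, Iterable, Set


def download_with_incremental_archive(
    urls: Iterable[str],
    archive_path: str,
    download: Callable[[str], bool],
    archived: Set[str],
) -> None:
    """Append each URL to the archive file right after its download succeeds."""
    # Append mode keeps entries written by earlier runs.
    with open(archive_path, "a", encoding="utf-8") as handle:
        for url in urls:
            if url in archived:
                continue  # already downloaded in this or a previous run
            if download(url):  # hypothetical callable, True on success
                handle.write(f"{url}\n")
                handle.flush()  # push the entry out before the next song starts
                archived.add(url)
```

With this shape, an interruption in the middle of a playlist would still leave the songs completed so far recorded in the file.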

AkaTenshi added 2 commits September 23, 2022 01:16

Add archive argument + class and integrate into download.

e1dd48e

Add new archive-parameter to usage docs.

4d47f96

xnetcat self-requested a review September 23, 2022 08:23

xnetcat reviewed Sep 23, 2022

Fix mypy errors.

1500cba

Silverarmor changed the base branch from master to dev September 24, 2022 06:28

Merge branch 'dev' into add_archive

5ff2f6f

xnetcat reviewed Oct 1, 2022

tests/utils/test_archive.py (outdated, resolved)

remove unnecessary decorator

134e348

xnetcat approved these changes Oct 1, 2022

xnetcat merged commit 38cc8c5 into spotDL:dev Oct 1, 2022

AkaTenshi deleted the add_archive branch October 1, 2022 22:46
