New Development Binaries System #827
Comments
Note: until this is resolved, there will be no new binary update :( |
I have pushed up a test of one potential solution (downloading directly from llama.cpp). You can see it here: https://github.com/SciSharp/LLamaSharp/blob/july-2024-binaries/LLama/LLamaSharp.csproj#L55 From a usability perspective this is pretty nice. Changing versions just requires updating that single However, llama.cpp does not publish any shared objects for Linux! |
In my opinion, a better option is to drop the local binaries from git everywhere except the backend package building. It's similar to your option 2, and I'd like to describe it in detail. All the binaries could be removed from the LLamaSharp git repo. Instead, we use the nuget backend packages in our example project. Where to put the binaries could be flexible because users are not supposed to touch them. For example, we can put the binaries in another git repo with git-lfs and then reference it as a submodule. In this case, when a binaries update is merged into the master branch, we must publish a new release at once to make the new binaries available on nuget. To keep our CI available, we need to update the submodules before running the tests in workflows. Besides, we need to copy the binaries to output folder in However, I have to say that unit test coverage would be a problem. What I said above is based on the assumption that unit tests are only for CI and examples are only for users. But actually, the example project is also responsible for test coverage now, which needs to be improved in the future. I think this way is clearer for users because they only need to care about the nuget packages. Any ideas? @martindevans |
What do you think of the potential solution I pushed up here. With this idea, the process would be:
The example linked above is downloading directly from llama.cpp, but that isn't an option at the moment (they don't publish Linux shared objects). So it would be modified to download from our own release (e.g. https://github.com/martindevans/LLamaSharp/releases/tag/test-binaries), but the idea is the same.
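As a rough illustration of the "download from our own release" idea, here is a minimal Python sketch (the actual prototype lives in the csproj linked above, not in Python). It relies on the standard GitHub release asset URL layout. The asset name `deps.zip`, the cache layout, and both function names are assumptions made up for this example.

```python
from pathlib import Path
from urllib.request import urlretrieve


def release_asset_url(owner: str, repo: str, tag: str, asset: str) -> str:
    """Build the download URL for a file attached to a GitHub release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{asset}"


def fetch_binaries(owner: str, repo: str, tag: str, asset: str, cache_dir: str) -> Path:
    """Download a release asset once and reuse the cached copy afterwards.

    The cache is keyed by tag, so switching binary versions only means
    changing the tag (hypothetical layout: <cache_dir>/<tag>/<asset>).
    """
    target = Path(cache_dir) / tag / asset
    if target.exists():
        # Already downloaded for this tag: no network access needed.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(release_asset_url(owner, repo, tag, asset), str(target))
    return target
```

With this shape, updating the binaries for development would come down to changing the single `tag` value passed in, which mirrors the usability point made about the csproj prototype.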
Just a note about git-lfs, unfortunately I don't think we can use it at all. The GitHub free limits on git-lfs are tiny - just 1GiB a month across your whole account! So downloading 5x CUDA binaries would totally exhaust your entire account allocation for the whole month. That's not something I want to risk happening by accident! |
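To make the bandwidth concern concrete, here is a small back-of-the-envelope sketch. The 1 GiB monthly figure is the one quoted above. The 150 MiB per-file size is purely a hypothetical number for illustration, since the thread only says the files are over 100MB.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def downloads_within_quota(file_sizes_bytes, quota_bytes=1 * GIB):
    """Return how many complete fetches of all files fit inside the quota."""
    per_fetch = sum(file_sizes_bytes)
    if per_fetch <= 0:
        raise ValueError("file sizes must add up to a positive number of bytes")
    return quota_bytes // per_fetch


# Hypothetical: five CUDA binaries of 150 MiB each is 750 MiB per fetch.
cuda_binaries = [150 * MIB] * 5
print(downloads_within_quota(cuda_binaries))
```

Under those assumed sizes a single clone or CI run would use most of the month's allowance, and a second one would exceed it.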
It's ok, but I think we should use the nuget package directly in our example project. That will be clearer for new users. The only thing we need to do is publish a new nuget package every time we update the binaries. Since git-lfs is limited on GitHub, downloading from the release is a good option. That raises another problem, though. If you want to run the GitHub workflows, the unit test project needs to download the new binaries. The binaries are put in a release. However, we shouldn't publish a release without passing all the workflows. The only way I can come up with is to delete the release if the workflow fails. |
I think if we published the releases ourselves we'd basically end up with two types of "release".
It's pretty messy :( Splitting out the releases to another repo (which exists just for binary releases) might be a way to work around that, but it's a bit of a pain to have multiple repos. I'm going to go and open an issue on the llama.cpp repo asking about shared objects in their releases. That way we would be able to skip the entire build step, our binaries would be the "official" ones, and we wouldn't have to mess around with any binary-only-releases. That'll probably be slower than anything we do ourselves here, but it seems like the best overall solution. |
Ok I changed my mind, I was typing up the feature request and it would be a colossal increase in the number of binaries they would need to compile for every release. I don't think there's any chance they would do it! |
I've created a new release in this repo, just to test what it would look like. It's here: https://github.com/SciSharp/LLamaSharp/releases/tag/test-release-please-ignore. If we decide to go ahead with investigating this approach I'll attach some binaries to it. If we went with a "binary only" release in this repo, we would do this:
|
Hmm, what about compressing the deps (at the end of the build action in |
The limit is 100MB and, according to an issue in the llama.cpp repo discussing the size, these binaries are going to grow (support for new GPUs, new kernels etc). Given how close we already are to the limit when zipped, that'd be a temporary solution. It is probably the easiest option though. |
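A quick way to see how much headroom zipping actually buys is to compress a build output folder and compare the archive size with the limit. This is a minimal sketch, not part of the existing build action. The limit is taken as 100 × 1024 × 1024 bytes (an assumption about how the "100MB" above is counted), and the function names are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumption: the "100MB" push limit counted as 100 * 1024 * 1024 bytes.
PUSH_LIMIT_BYTES = 100 * 1024 * 1024


def zip_directory(src_dir: str, zip_path: str) -> int:
    """Compress every file under src_dir into zip_path; return the archive size.

    zip_path should live outside src_dir, otherwise the archive would be
    picked up by the directory walk and try to include itself.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(src.rglob("*")):
            if f.is_file():
                # Store paths relative to src_dir so the archive unpacks cleanly.
                zf.write(str(f), f.relative_to(src).as_posix())
    return Path(zip_path).stat().st_size


def headroom(size_bytes: int, limit_bytes: int = PUSH_LIMIT_BYTES) -> float:
    """Fraction of the limit still unused (negative means over the limit)."""
    return 1 - size_bytes / limit_bytes
```

A small or negative `headroom` value for the zipped CUDA binaries would confirm that compression only postpones the problem.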
Yes, indeed. I don't know how much/fast the binaries on the llama.cpp side will grow, but it sounds like a matter of time until we run into the same issue. If the transition to the LLamaSharp-Binaries workflow goes smoothly, a temporary solution could probably be skipped entirely. I think it's cleaner to split the binaries out into LLamaSharp-Binaries, to avoid confusion in the LLamaSharp releases section. So, when doing a binary update:
|
Created a repo here for development: https://github.com/martindevans/LLamaSharpBinaries/releases/tag/1c5eba6f8e62 I'll put together a prototype downloading binaries from here, and will transfer ownership to SciSharp if we go ahead with this approach. |
Draft PR here: #833 |
Released now with 0.14.0 |
Description
We have a problem with the next binary update. The code changes have been done, and are sitting in this branch. The binaries have been compiled, and are sitting in this action.
However, the new binaries from llama.cpp are now too large to commit to GitHub (over 100MB). The push simply fails. GitHub suggests git-lfs, but this does not seem like a practical solution because the free bandwidth is too low - we would exhaust it with just 5 files. Therefore, we can no longer commit the binaries.
Note that this does not affect the final distribution since that is done through nuget, it just affects development in this repo.
Ideas: