Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce FUSE package import time #184

Closed
orso82 opened this issue Jan 6, 2023 · 15 comments
Closed

Reduce FUSE package import time #184

orso82 opened this issue Jan 6, 2023 · 15 comments
Labels

Comments

@orso82
Copy link
Member

orso82 commented Jan 6, 2023

@lstagner says v1.10 and maybe v1.9 is going to have extension packages that can be optionally loaded. Basically built in Requires.jl This would help package load times and CI

@orso82 orso82 added the JuliaHub label Jan 6, 2023
@orso82 orso82 changed the title Reduce load time Reduce FUSE package load time Jan 6, 2023
@orso82 orso82 changed the title Reduce FUSE package load time Reduce FUSE package import time Jan 6, 2023
@ChrisRackauckas
Copy link

SciML/DiffEqBase.jl#856 is how to do it with backwards compatibility. @anandijain let me know if you have issues with this

@orso82
Copy link
Member Author

orso82 commented Jan 17, 2023

Weak dependencies and extension packages in 1.9

  • Weak dependencies: version bounds on things that you do not depend on
  • Extension packages: for lazy loading of packages

@anandijain
Copy link

here is a dataframe created with https://github.com/anandijain/JuliaBenchmarker.jl/blob/2a9515357997031d35e430134e056c8d66288a54/src/JuliaBenchmarker.jl#L82-L99

of the @time_imports using FUSE

fuse_import_times.csv

the total using time was ~32 seconds, CUDA being the worst offender. It definitely seems like we could shave 5 seconds off of using time by making it a weak dep. I think we could probably do it with Plots too. So we could maybe save 7 seconds off.

I don't have a breakdown of precompile times, see JuliaLang/Pkg.jl#3328 for the issue on getting that implemented.

i still need to spend more time on this, but i just wanted to share

@anandijain
Copy link

anandijain commented Feb 7, 2023

heres a dataframe giving the breakdown of precompilation time for fuse. note that the sum of df.time > precomp time, since a lot of packages are done in parallel.

FUSE_precomp.csv

so sum(df.time) gives ~ 1300 seconds. but wall time is ~350 seconds.

note that this isn't on mac (not affected by JuliaLang/julia#48473), and done right after rm -rf ~/.julia/compiled/v1.10/

julia> versioninfo()
Julia Version 1.10.0-DEV.524
Commit 2c619b77e04 (2023-02-07 12:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × AMD EPYC 7513 32-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver3)
  Threads: 8 on 64 virtual cores
Environment:
  JULIA_NUM_THREADS = 8

also here is a graph of the private registry packages dependencies in GAregistry, generated with https://github.com/anandijain/MyPkgGraph.jl

(i just realized this is slightly bugged)

Image

the goal here would be to have a better sense of how changes to fuse (weak deps, more snoop compile, etc) will improve precomp and using time.

i think it could be useful to put this into github actions, so along with julia/runtests we also get reports on using/precomp time.

@anandijain
Copy link

not that important, but the graph is "wrong" just because it's been so long since FUSE tagged a release (since i generate the graph from the registry)

@anandijain
Copy link

another thing that i'd like to do is combine this with JET or snoopcompile. this would then give us the methods we would need to change in order to excise a dependency (plots for example)

julia> Aqua.test_stale_deps(NNeutronics)
NNeutronics [a9424c20-d414-11ec-167b-9106c24d956c]: Test Failed at /home/anandijain/.julia/packages/Aqua/utObL/src/stale_deps.jl:24
  Expression: result  true
   Evaluated: ⟪result: 😭 FAILED: NNeutronics [a9424c20-d414-11ec-167b-9106c24d956c]
    Some package(s) in `deps` of NNeutronics [a9424c20-d414-11ec-167b-9106c24d956c] are not loaded during via `using NNeutronics`.
    * Interact [c601a237-2ae4-5e1e-952c-7a85b0c7eef1]
    * Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80]
    * ProgressMeter [92933f4c-e287-5a05-a399-4b506db050ca]
    
    To ignore from stale dependency detection, pass the package name to `ignore` keyword argument of `Aqua.test_stale_deps` true
Stacktrace:
 [1] macro expansion
   @ ~/.juliaup/julia-2c619b77e0/share/julia/stdlib/v1.10/Test/src/Test.jl:477 [inlined]
 [2] macro expansion
   @ ~/.julia/packages/Aqua/utObL/src/stale_deps.jl:24 [inlined]
 [3] macro expansion
   @ ~/.juliaup/julia-2c619b77e0/share/julia/stdlib/v1.10/Test/src/Test.jl:1590 [inlined]
 [4] test_stale_deps(packages::Module; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Aqua ~/.julia/packages/Aqua/utObL/src/stale_deps.jl:22
Test Summary:                                      | Fail  Total  Time
NNeutronics [a9424c20-d414-11ec-167b-9106c24d956c] |    1      1  0.4s

I think i see a few efficient paths to cleaning up our dependency tree. I'll want to run this on everything in the registry soon

@orso82
Copy link
Member Author

orso82 commented Feb 7, 2023

Let's target images package as the first to be imported as a weak dependency!
We use it only once and it is rarely used.

@orso82
Copy link
Member Author

orso82 commented Feb 7, 2023

@anandijain for packages that use Jupyter notebooks for some functionality (eg. EPEDNN, TGLFNN, NNeutronics) perhaps use two different Project.toml files. One for the library part, the other for the Jupyter part.

@orso82
Copy link
Member Author

orso82 commented Feb 7, 2023

How to get a quick dependency tree for a package (pseudocode):

Image

Though this will not be de-duplicated

Also look at
Image

but it can be slow

@anandijain
Copy link

anandijain commented Feb 8, 2023

note that to run this you need my fork https://github.com/anandijain/Aqua.jl/tree/aj/stale which just makes getting the stale dependencies a bit easier.

we can see

using Aqua, FUSE # assuming FUSE is current active env
using Pkg

pkg_specs = Pkg.Operations.load_all_deps(Pkg.API.EnvCache())
to_test = pkg_specs[findall(x->!isnothing(x.path), pkg_specs)]

pkg_ids = map(p -> Base.PkgId(p.uuid, p.name), to_test)
stale_deps = Aqua.get_stale_deps(pkg_ids)
# ps = to_test .=> stale_deps
map(x->x.name, pkg_ids) .=> stale_deps

so for each of the FUSE packages, we get a list of the dependencies that are not used

16-element Vector{Pair{String, Vector{Base.PkgId}}}:
   "FiniteElementHermite" => []
                 "CHEASE" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe]]
                 "EPEDNN" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe], Pickle [fbb45041-c46e-462f-888f-7c521cafbc2c], ProgressMeter [92933f4c-e287-5a05-a399-4b506db050ca]]
            "NNeutronics" => [Interact [c601a237-2ae4-5e1e-952c-7a85b0c7eef1], Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80], ProgressMeter [92933f4c-e287-5a05-a399-4b506db050ca]]
                 "TAUENN" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe], Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80]]
            "Equilibrium" => []
                    "QED" => [PackageCompiler [9b87118b-4619-50d2-8e1e-99f35a4d4d9d]]
           "VacuumFields" => []
                   "IMAS" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe], ProgressMeter [92933f4c-e287-5a05-a399-4b506db050ca]]
 "MillerExtendedHarmonic" => []
  "CoordinateConventions" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe]]
        "FusionMaterials" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe]]
   "SimulationParameters" => []
     "Fortran90Namelists" => []
                 "IMASDD" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe], ProgressMeter [92933f4c-e287-5a05-a399-4b506db050ca]]
                 "TGLFNN" => [Revise [295af30f-e4ad-537b-8983-00126c2a3abe], Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80], Pickle [fbb45041-c46e-462f-888f-7c521cafbc2c]]

I think we discussed it, but I forgot what was said, why do we have Revise in all of these instead of just have the make install just add it to the global environment and then tell the team to make sure it's in the startup.jl

theres nothing inherently wrong with it, but it is a bit odd (in that I haven't seen it done anywhere else). That said, I had never seen the Makefile, but that works really well. So it could be I'm just missing why it needs to be there

@anandijain
Copy link

I just did a new @time_imports using FUSE and GPU related packages take 5 seconds out of 22.1 (~22%).
so making some decisions on handling #212 will be a big part of reducing import times.

@anandijain
Copy link

anandijain commented Feb 8, 2023

if you use the latest https://github.com/anandijain/MyPkgGraph.jl this code will allow you to make the deptree of FUSE (or anything).

One thing that I may have forgotten to say is, I'm pretty sure https://github.com/tfiers/PkgGraph.jl doesn't work with private registries. He makes prettier graphs and has some nice features that I don't have. But it is like 100X slower

using MyPkgGraph, Catlab.Graphics
draw(r, s; dir=:out) = to_graphviz(MyPkgGraph.my_depgraph(r, s; dir); node_labels=:label)
draw(s; dir=:out) = draw(GENERAL_REGISTRY, s; dir)

r1 = MyPkgGraph.REGISTRIES[findfirst(reg.name == "GAregistry" for reg in MyPkgGraph.REGISTRIES)]
r2 = MyPkgGraph.GENERAL_REGISTRY
r3 = MyPkgGraph.merge_registries(r1, r2)
g = registry_graph(r3)
draw(r3, "FUSE")

@orso82
Copy link
Member Author

orso82 commented Feb 9, 2023

@anandijain I have removed the Revise dependency from all packages under ProjectTorreyPines that relate to FUSE, and I now just add Revise to the global environment.
Thank you for the suggestion!

@anandijain
Copy link

Nice! @orso82 https://github.com/ProjectTorreyPines/FUSE.jl/actions/runs/4136836312/jobs/7151239707#step:7:499

We need to remove it from being loaded in test files as well

@orso82
Copy link
Member Author

orso82 commented Feb 14, 2023

  1. Use Flux from PR that has CUDA as weak dependency (do it through the FUSE Makefile)
  2. Try to make a dev CUDA dummy package (same UUID very high version) ?

@orso82 orso82 closed this as completed Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants