ls -R could run in parallel to improve the performances (aka ls -R is much slower than GNU's) #2069

Closed
sylvestre opened this issue Apr 11, 2021 · 5 comments · Fixed by #2083
Labels
good first issue For newcomers!

Comments

@sylvestre
Contributor

Running ls -R on a big tree (the Firefox sources, for example), we are about 5 times slower than the GNU version.

$ hyperfine --warmup 2   --show-output 'ls -al -R mozilla-central.hg  > /dev/null' '/home/sylvestre/dev/debian/coreutils//target/release/coreutils ls -al -R mozilla-central.hg > /dev/null'  
...
Summary
  'ls -al -R /home/sylvestre/dev/mozilla/mozilla-central.hg  > /dev/null' ran
    4.71 ± 0.31 times faster than '/home/sylvestre/dev/debian/coreutils//target/release/coreutils ls -al -R mozilla-central.hg > /dev/null'

Running it in parallel could help.

sylvestre added the "good first issue" (For newcomers!) label on Apr 11, 2021
@siebenHeaven
Contributor

siebenHeaven commented Apr 11, 2021

Hello,

I'm looking into contributing to a Rust project as the next step in my "Learning Rust" journey.

This seems like a good issue to take up. I have some experience parallelizing with rayon - is that something we want to explore here?

I am also curious - would looking for other performance bottlenecks before parallelizing be worthwhile?
(e.g. there seems to be a high number of syscalls that we could avoid:

anup@LAPTOP-29TC204U:~/oss/coreutils$ strace -c /home/anup/oss/coreutils/target/release/coreutils ls -al -R /home/anup/linux_kernel_labs/src/linux/ > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 20.77   13.192816          52    249169        43 statx
 15.16    9.629992          54    177906           openat
 14.29    9.072997          50    179638           close
 14.03    8.911072          51    172847           read
 13.78    8.749061          49    177907           fstat
 13.18    8.371881          48    172833           lseek
  7.53    4.779508          49     96540           write
  0.94    0.594773          58     10131           getdents64
  0.17    0.109381          63      1732      1732 connect
  0.15    0.097318          56      1732           socket
  0.00    0.002759          64        43           readlink
  0.00    0.000789          56        14           brk
  0.00    0.000063          21         3           munmap
  0.00    0.000050          16         3           sigaltstack
  0.00    0.000000           0         1           poll
  0.00    0.000000           0        32           mmap
  0.00    0.000000           0        10           mprotect
  0.00    0.000000           0         7           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         1         1 ioctl
  0.00    0.000000           0         8           pread64
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         2           prlimit64
  0.00    0.000000           0         1           getrandom
------ ----------- ----------- --------- --------- ----------------
100.00   63.512460               1240569      1778 total

anup@LAPTOP-29TC204U:~/oss/coreutils$ strace -c ls -al -R /home/anup/linux_kernel_labs/src/linux/ > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.95    5.317159          54     97813     97813 getxattr
 28.39    4.581664          55     82925     82925 lgetxattr
 28.03    4.523563          54     82925           lstat
  3.63    0.585143          58      9953           getdents64
  3.12    0.504071          50      9982           fstat
  1.78    0.287848          57      5014         9 openat
  1.70    0.274287          54      5011           close
  0.38    0.061963          51      1198           write
  0.01    0.001983          53        37           readlink
  0.00    0.000154           9        17           read
  0.00    0.000098          24         4           lseek
  0.00    0.000071          71         1           mremap
  0.00    0.000062           1        47           mmap
  0.00    0.000060          15         4           brk
  0.00    0.000000           0        10           mprotect
  0.00    0.000000           0         2           munmap
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         3         3 ioctl
  0.00    0.000000           0         8           pread64
  0.00    0.000000           0         2         2 access
  0.00    0.000000           0         4           socket
  0.00    0.000000           0         4         4 connect
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2         2 statfs
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           prlimit64
------ ----------- ----------- --------- --------- ----------------
100.00   16.138126                294976    180759 total

)

@ArniDagur
Contributor

@siebenHeaven Great. I would also look at #1379; du is a tool that benefits from parallel directory walking in a similar manner. The first thing I'd do before rolling our own directory walker is to look at ignore. That's the super fast parallel directory walker that ripgrep and fd use.
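
To make the parallel-walking idea concrete, here is a minimal sketch that uses only the standard library instead of ignore or rayon (an assumption, chosen so the example has no dependencies). The names `read_one` and `walk_parallel` are hypothetical and are not taken from the uutils code. It reads the directories of each tree level on a few scoped threads and collects the results into a sorted map, so that output order stays deterministic and printing can remain sequential.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Read one directory: sorted entry names plus the subdirectories to descend into.
/// (Hypothetical helper, not part of uutils.)
fn read_one(dir: &Path) -> io::Result<(Vec<String>, Vec<PathBuf>)> {
    let mut names = Vec::new();
    let mut subdirs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // file_type() does not follow symlinks, so linked directories are not descended into.
        if entry.file_type()?.is_dir() {
            subdirs.push(entry.path());
        }
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok((names, subdirs))
}

/// Walk `root` level by level, reading the directories of each level in parallel.
fn walk_parallel(root: &Path) -> BTreeMap<PathBuf, Vec<String>> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let mut listing = BTreeMap::new();
    let mut level = vec![root.to_path_buf()];
    while !level.is_empty() {
        // Split the current level into at most `workers` chunks, one thread per chunk.
        let chunk_len = (level.len() + workers - 1) / workers;
        let results: Vec<(PathBuf, Vec<String>, Vec<PathBuf>)> = thread::scope(|s| {
            let handles: Vec<_> = level
                .chunks(chunk_len)
                .map(|chunk| {
                    s.spawn(move || {
                        chunk
                            .iter()
                            .filter_map(|dir| {
                                // Unreadable directories are silently skipped in this sketch.
                                let (names, subdirs) = read_one(dir).ok()?;
                                Some((dir.clone(), names, subdirs))
                            })
                            .collect::<Vec<_>>()
                    })
                })
                .collect();
            handles
                .into_iter()
                .flat_map(|h| h.join().expect("worker thread panicked"))
                .collect()
        });
        // Record this level's listings and queue the next level.
        level = Vec::new();
        for (dir, names, subdirs) in results {
            listing.insert(dir, names);
            level.extend(subdirs);
        }
    }
    listing
}
```

A caller would iterate over the returned map and print each directory's names in turn. Whether this actually beats a sequential walk depends on the filesystem and on how much per-file work (stat calls, formatting) happens for each entry, so it would need to be measured with hyperfine as above.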

@siebenHeaven
Contributor

Thanks @ArniDagur - I'll take a look. ignore seems promising.

@tertsdiepraam
Member

Hi @siebenHeaven! Awesome that you want to pick this up! I think you are right that there are a lot of parts of ls that could be improved without parallelization. For example, the metadata for each file gets retrieved separately for each part of the program that uses it (this is probably where those syscalls come from). It could indeed be a relatively easy performance win to store the metadata along with the paths to reuse it. String handling is also a little bit clunky in certain places, because there are a lot of calls to OsString::to_string_lossy() (often multiple times per file) and some unnecessary allocations.
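
As a rough illustration of storing the metadata along with the path, here is a minimal sketch. The `PathData` struct and its methods are hypothetical names, not the actual ls implementation, and it assumes a toolchain where `std::cell::OnceCell` is available. Each entry fetches its metadata and its lossy display name at most once and hands out references afterwards.

```rust
use std::cell::OnceCell;
use std::fs::{self, Metadata};
use std::path::PathBuf;

/// Hypothetical per-entry record: the path plus lazily computed, cached data.
struct PathData {
    path: PathBuf,
    // Filled on first use; an inner `None` means the lookup failed.
    metadata: OnceCell<Option<Metadata>>,
    // Lossy display name, converted once instead of on every use.
    display_name: OnceCell<String>,
}

impl PathData {
    fn new(path: PathBuf) -> Self {
        PathData {
            path,
            metadata: OnceCell::new(),
            display_name: OnceCell::new(),
        }
    }

    /// Returns the cached metadata, querying the filesystem only on the first call.
    fn metadata(&self) -> Option<&Metadata> {
        self.metadata
            .get_or_init(|| fs::symlink_metadata(&self.path).ok())
            .as_ref()
    }

    /// Returns the file name as a string, doing the lossy conversion only once.
    fn display_name(&self) -> &str {
        self.display_name.get_or_init(|| {
            self.path
                .file_name()
                .map(|name| name.to_string_lossy().into_owned())
                .unwrap_or_else(|| self.path.to_string_lossy().into_owned())
        })
    }
}
```

With a record like this, sorting, column sizing, and printing could all call `metadata()` and `display_name()` on the same entry without issuing further syscalls or allocations after the first call.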

@siebenHeaven
Contributor

Yes, I've been trying to cache the metadata so that the repeated statx syscalls are avoided, and to reduce unnecessary clones in hot paths along the way. I'll get back with some results soon.
