Repo Server high number of git requests when no changes are requested #12878

Open
d-wierdsma opened this issue Mar 15, 2023 · 18 comments
Labels
bug Something isn't working

Comments

@d-wierdsma

Checklist:

  • [ x ] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • [ x ] I've included steps to reproduce the bug.
  • [ x ] I've pasted the output of argocd version.

Describe the bug

After upgrading ArgoCD from 2.6.2 to 2.6.4, the Repo Server was unable to get a git client, which forced many apps into an Unknown state. When I manually refreshed the apps, they resolved the git client without issues. However, they would then refresh again on their own and fall back into the Unknown state, even though they had just been Healthy. As the screenshots show, the Repo Server kept making a large number of git requests, visible in the ls-remote dashboard panel.

Our current ArgoCD instance deploys to 5 clusters in total using an App of Apps generator method. We currently have ~120 applications managed by this centralized ArgoCD instance.

Our current setup for applications is roughly as follows:
For each team and environment we have an App of Apps that creates another set of App of Apps, one for each application that we would like to deploy. Each sub App of Apps then deploys the application (for example, prometheus) to the appropriate external clusters.

We are also using the new multi-source applications feature to combine values files from our private git repositories with a mix of private and public Helm charts.

To Reproduce

We have disabled automatic refresh on apps in favour of git webhooks, which refresh apps when there are changes to the repos.

Have ~100 applications on a HA ArgoCD setup, with the following relevant settings:

repoServer:
  resources:
    limits:
      cpu: '1'
      memory: 512Mi
    requests:
      cpu: 250m
      memory: 256Mi

  env:
  - name: ARGOCD_GIT_ATTEMPTS_COUNT
    value: "3"

configs:
  cm:
    timeout.reconciliation: 0s
    
  params:
    reposerver.parallelism.limit: 10

I assume that the Repo Server hit a rate limit built into our internal GitLab instance and kept sending requests after getting failures.

Expected behavior

I expect Repo Server to eventually fail on calls to the git service and not keep sending requests when there are no changes and the application is healthy.
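
The expectation above amounts to a bounded retry: try a fixed number of times, back off between attempts, then surface the failure instead of continuing to send requests. A minimal sketch of that pattern follows. It is not Argo CD's implementation; `call_with_bounded_retries` and `GitUnavailable` are hypothetical names, and the default of 3 attempts only mirrors the `ARGOCD_GIT_ATTEMPTS_COUNT` value in the settings above.

```python
import time


class GitUnavailable(Exception):
    """Raised when every attempt to reach the git server has failed."""


def call_with_bounded_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `operation` at most `attempts` times, doubling the delay each time.

    Hypothetical helper for illustration only; `operation` is any
    zero-argument callable that raises an exception on failure.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as err:  # a real client would catch narrower errors
            last_error = err
            # Back off before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # Give up and report the failure instead of retrying forever.
    raise GitUnavailable(f"gave up after {attempts} attempts") from last_error
```

With `attempts=3` and `base_delay=0.5`, a permanently failing operation is called three times with waits of 0.5 and 1.0 seconds between calls, and then the error is raised to the caller.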

Screenshots

https://user-images.githubusercontent.com/13317139/225333581-0e118090-e311-457a-987e-0a9861860129.png
https://user-images.githubusercontent.com/13317139/225334095-3786b36b-87ab-45b4-a572-18593fa371ee.png

Version
Unable to determine the exact SHA as it took out our git version, but it was v2.6.4 as of approx. Tue Mar 14 12:03:49 2023 -0400

Logs

		
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:https://chartmuseum....,Path:,TargetRevision:0.0.4,Helm:\u0026ApplicationSourceHelm{ValueFiles:[$values/apps/core/helm-values/cnc/kube-spot-termination-notice-handler-values.yaml],Parameters:[]HelmParameter{},ReleaseName:kube-spot-termination-notice-handler,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:kube-spot-termination-notice-handler,Ref:,}/0.0.4","time":"2023-03-14T17:30:49Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:49Z","grpc.time_ms":264.745,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:49Z"}
{"level":"info","msg":"manifest cache miss: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:values,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:49Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:49Z","grpc.time_ms":75.812,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:49Z"}
{"level":"info","msg":"manifest cache miss: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:values,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":2.291,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/prod/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":1.021,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:https://kubernetes-sigs.github.io/metrics-server/,Path:,TargetRevision:3.8.3,Helm:\u0026ApplicationSourceHelm{ValueFiles:[$values/apps/prod/helm-values/cnc/metrics-server-values.yaml],Parameters:[]HelmParameter{},ReleaseName:metrics-server,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:metrics-server,Ref:,}/3.8.3","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:49Z","grpc.time_ms":278.795,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/core/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/test/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":1.348,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache miss: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:values,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":2.362,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":1.619,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/dev/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":2.416,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/prod/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":12.053,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/test/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":205.333,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/dev/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":1.166,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/prod/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":136.562,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/core/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":111.975,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/test/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":85.566,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:49Z","grpc.time_ms":640.268,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/test/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":81.811,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/dev/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":417.581,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/core/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:50Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":1.642,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":712.672,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":769.005,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":851.914,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":803.429,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":693.523,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":866.32,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:50Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/core/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:51Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:51Z","grpc.time_ms":1.019,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/dev/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:51Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:51Z","grpc.time_ms":1.042,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/core/manifests/cnc/external-snapshotter,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:51Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:51Z","grpc.time_ms":4.56,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:git@git.internal.....git,Path:apps/prod/manifests/cnc/cluster-autoscaler,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:\u0026ApplicationSourceDirectory{Recurse:true,Jsonnet:ApplicationSourceJsonnet{ExtVars:[]JsonnetVar{},TLAs:[]JsonnetVar{},Libs:[],},Exclude:,Include:,},Plugin:nil,Chart:,Ref:,}/b2979fce4034fbc307149e628e05ce3bd5db892f","time":"2023-03-14T17:30:51Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:51Z","grpc.time_ms":4.268,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":827.495,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
{"error":"failed to get git client for repo git@git.internal.....git","grpc.code":"Unknown","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-03-14T17:30:50Z","grpc.time_ms":815.685,"level":"error","msg":"finished unary call with code Unknown","span.kind":"server","system":"grpc","time":"2023-03-14T17:30:51Z"}
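
For anyone comparing their own repo-server output against the log lines above, here is a small sketch that tallies cache hits, cache misses, and failed gRPC calls. It assumes one JSON object per line with the `msg`, `level`, `error`, and `grpc.method` fields seen in this log; the function name `tally_repo_server_logs` and the abbreviated `SAMPLE` lines are made up for illustration.

```python
import json
from collections import Counter


def tally_repo_server_logs(lines):
    """Count cache hits, cache misses, and failed gRPC calls in JSON log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            counts["unparsed"] += 1
            continue
        if not isinstance(entry, dict):
            counts["unparsed"] += 1
            continue
        msg = entry.get("msg", "")
        if msg.startswith("manifest cache hit"):
            counts["cache_hit"] += 1
        elif msg.startswith("manifest cache miss"):
            counts["cache_miss"] += 1
        elif entry.get("level") == "error" and "grpc.method" in entry:
            # Group failed calls by their error text.
            counts["grpc_error:" + entry.get("error", "unknown")] += 1
    return counts


# Abbreviated, made-up lines in the same shape as the log above.
SAMPLE = [
    '{"level":"info","msg":"manifest cache hit: example","time":"2023-03-14T17:30:50Z"}',
    '{"level":"info","msg":"manifest cache miss: example","time":"2023-03-14T17:30:50Z"}',
    '{"error":"failed to get git client for repo example","grpc.code":"Unknown",'
    '"grpc.method":"GenerateManifest","level":"error","msg":"finished unary call with code Unknown"}',
    "not json",
]

if __name__ == "__main__":
    for key, value in sorted(tally_repo_server_logs(SAMPLE).items()):
        print(key, value)
```

To run it on real output, pass an open file object (for example, a saved copy of the repo-server logs) as `lines`.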
@d-wierdsma
Author

I was just able to reproduce this issue when attempting an upgrade from v2.6.1 to v2.6.7. This time I did not see any "failed to get git client for repo" errors; however, I rolled back within 15 minutes, so I'm guessing it just didn't have time to reach the git rate limit.
Screen Shot 2023-04-06 at 11 35 01 AM

Screen Shot 2023-04-06 at 11 37 47 AM

We can see from these images that CPU spikes almost immediately, causing the HPA to scale up the number of repo-servers, which compounds the issue.

image
As for logs, there is also a distinct spike in log volume at this time. I'm still investigating these logs to see if there are any apparent issues, but at first glance they look like the following:

{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:https://chartmuseum.xxx.com,Path:,TargetRevision:0.2.4,Helm:\u0026ApplicationSourceHelm{ValueFiles:[$values/apps/test/application-values.yaml $values/apps/test/test-values.yaml $values/clusters/test/cluster-values.yaml],Parameters:[]HelmParameter{},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:gitops,Ref:,}/0.2.4","time":"2023-04-06T15:17:41Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-04-06T15:17:41Z","grpc.time_ms":588.414,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-04-06T15:17:41Z"}
{"level":"info","msg":"manifest cache miss: \u0026ApplicationSource{RepoURL:git@git.xxx.git,Path:,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:values,}/0a6daabbea0494097a41c0bcbacece4cb1908631","time":"2023-04-06T15:17:41Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-04-06T15:17:41Z","grpc.time_ms":3.614,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-04-06T15:17:41Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:https://chartmuseum.xxx.com,Path:,TargetRevision:0.2.4,Helm:\u0026ApplicationSourceHelm{ValueFiles:[$values/apps/shared-services/application-values.yaml $values/apps/shared-services/shared-services-values.yaml $values/clusters/shared-services/cluster-values.yaml],Parameters:[]HelmParameter{},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:gitops,Ref:,}/0.2.4","time":"2023-04-06T15:17:41Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-04-06T15:17:41Z","grpc.time_ms":628.879,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-04-06T15:17:41Z"}
{"level":"info","msg":"manifest cache miss: \u0026ApplicationSource{RepoURL:git@git.xxx.git,Path:,TargetRevision:HEAD,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:values,}/61355835ddae350ebe8c19e3ed49a426c574a464","time":"2023-04-06T15:17:41Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-04-06T15:17:41Z","grpc.time_ms":3.632,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-04-06T15:17:41Z"}
{"level":"info","msg":"manifest cache hit: \u0026ApplicationSource{RepoURL:https://chartmuseum.xxx.com,Path:,TargetRevision:0.2.4,Helm:\u0026ApplicationSourceHelm{ValueFiles:[$values/apps/dev/application-values.yaml $values/apps/dev/dev-values.yaml $values/clusters/dev/cluster-values.yaml],Parameters:[]HelmParameter{},ReleaseName:,Values:,FileParameters:[]HelmFileParameter{},Version:,PassCredentials:false,IgnoreMissingValueFiles:false,SkipCrds:false,},Kustomize:nil,Directory:nil,Plugin:nil,Chart:gitops,Ref:,}/0.2.4","time":"2023-04-06T15:17:41Z"}
{"grpc.code":"OK","grpc.method":"GenerateManifest","grpc.service":"repository.RepoServerService","grpc.start_time":"2023-04-06T15:17:41Z","grpc.time_ms":635.541,"level":"info","msg":"finished unary call with code OK","span.kind":"server","system":"grpc","time":"2023-04-06T15:17:41Z"}

@d-wierdsma
Author

I've also just verified that our GitLab instance has no authenticated API request rate limits set. As we are using SSH creds, I assume this is how Repo Server is making requests.

@d-wierdsma d-wierdsma changed the title from "Repo Server hammering git instance when no changes are present" to "Repo Server high number of git requests when no changes are requested" on Apr 6, 2023
@d-wierdsma
Author

The interesting part of this to me is that Repo Server seems to be pulling all repos on startup, even though we have disabled automatic sync and only intend to trigger syncs from webhooks. It's entirely possible I'm misunderstanding the Repo Server startup process, though.

@r0bj

r0bj commented Apr 25, 2023

I experienced a similar issue: ArgoCD multi-source Applications stayed in the Unknown state until manually refreshed. I also noticed some GitHub rate limiting during this issue.

@d-wierdsma
Author

I turned on debug logs for the ApplicationSet controller and the Application Controller and saw that the Application Controller was creating a ton of reconciliation loops due to orphaned resources.

I had set the orphanedResources field on all my ArgoCD Projects, which made my applications attempt to claim ownership of all orphaned resources within the namespace each application is deployed to.

spec:
  description: Argocd Project
  orphanedResources:
    warn: false

Here is the difference in reconciliation and git ls-remote calls.
image
image

There are still some reconciliation loops in place that I need to investigate, but it's significantly better now.

Ref: #8100 (comment)

@andrleite

andrleite commented Jun 27, 2023

Hello, are there any updates on this?
We tried to upgrade from version 2.6.2 to 2.6.11, and git requests and reconciliations started increasing immediately. After rolling back, they decreased significantly.
image
image

@andrleite

@crenshaw-dev Did you see something similar? I saw you asked us to create a separate issue. Could you please help us with it?

@andrleite

andrleite commented Jun 30, 2023

I've noticed that the issue starts at version 2.6.3, with an endless loop of reconciliation happening for applications that have recurse: true.

@rayleshh

Same Here!

@crenshaw-dev
Member

Is everyone here using ApplicationSets? I suspect the issue might be related to the ApplicationSet controller failing to normalize the App spec before applying it. The Application controller and the ApplicationSet controller end up fighting over the correct App manifest, resulting in constant reconciliation.

I've merged a fix: #14481
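
To make the "fighting" described above concrete, here is a toy model of two controllers writing the same object. It is purely illustrative and not Argo CD code: the `project` default and the function names are hypothetical, and the thread does not say which fields were actually affected.

```python
def normalize(spec):
    """Hypothetical normalization: fill in a default field if it is missing."""
    normalized = dict(spec)
    normalized.setdefault("project", "default")
    return normalized


def count_writes(appset_normalizes, rounds=5):
    """Count writes to the live spec over `rounds` reconciliation passes."""
    desired = {"name": "demo"}  # what the generator template yields
    live = {}
    writes = 0
    for _ in range(rounds):
        # ApplicationSet-like controller: rewrite live if it differs from desired.
        want = normalize(desired) if appset_normalizes else desired
        if live != want:
            live = dict(want)
            writes += 1
        # Application-like controller: always stores the normalized form.
        normalized = normalize(live)
        if live != normalized:
            live = normalized
            writes += 1
    return writes
```

In this model, `count_writes(False)` keeps producing two writes per round because each controller undoes the other's change, while `count_writes(True)` settles after a single write because both sides compare the same normalized form.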

@andrleite

Yes, we're using ApplicationSets, as others mentioned in #14712. I've tried version 2.7.10 with no success.

@stafot

stafot commented Aug 3, 2023

Relevant comments to this. Adding them here for reference: #14712 (comment), #14712 (comment), #14712 (comment), #14712 (comment), #14712 (comment), #14712 (comment)

@crenshaw-dev
Member

@stafot do we know for certain yet that the appset controller is involved at all in the high request count in your env? What happens if you scale down the controller for a few minutes?

I'm a little suspicious that multi-source apps might be to blame in your case: #14725

@andrleite

@crenshaw-dev We did the test: after upgrading ArgoCD and scaling the appset controller down to zero, the reconciliation activity kept increasing.

@andrleite

@crenshaw-dev Do you think #14725 is related? I was reading the recent messages there, which seem very similar to our problem. We're using the app-of-apps pattern with multi-source apps.

@spirosoik
Contributor

spirosoik commented Jan 10, 2024

@crenshaw-dev I am wondering if there will be any action on this. It has been happening for several months and has been reported by several people, yet there seems to be no real activity on this issue.

It's a pity that we cannot even upgrade to the latest versions of ArgoCD to catch up on security vulnerability fixes and the latest improvements. Is there any other workaround?

@andrleite

@nromriell We're following your amazing work on this issue, where two parts have already been merged. We believe our issue is related, and we want to share some results after upgrading to 2.9.5.
We had been stuck on version 2.6.2 since the bug was introduced, so we upgraded from it. Before your changes, reconciliation and git requests increased non-stop; now they are high but stable, as the graphs below show:

image
image

We have ~150 multi-source apps.

Do you believe this is expected until the third and fourth parts are merged?

Thanks.

@nromriell
Contributor

Hi @andrleite, as of the last state I would expect the checkouts at least to be lower.

My changes are primarily around fixing the number of git requests between cache invalidations. Looking at the graphs you shared here, it looks like your cache is nearly constantly invalidated, which looks like the primary issue and is likely why you aren't seeing the behavior you'd expect. Have you tried setting timeout.reconciliation to something very high, like 24 hours, rather than 0 to compare?

My time has been pretty limited lately, so I haven't been able to continue making improvements here, but I should at least be able to look at opening the remaining two PRs shortly. As is, though, I think those probably won't fix what you're seeing, since they rely on the items being cached; they would just reduce the call count per cycle.


8 participants