Default route not set #23

Closed
ertimas opened this issue Sep 13, 2024 · 14 comments

Comments

@ertimas

ertimas commented Sep 13, 2024

Hi Andy,

I am running into a missing default IP route, which the k3s team works around with a dummy interface in this issue: k3s-io/k3s#1144.
image

It seems like you would have run into that? Did you have this configured prior to the script, or did I miss something?

Thanks for your time!

@clemenko
Owner

I have not seen that before. Let's look at the basics.
What OS? cat /etc/os-release
What NICs? ip a
Did you modify the config.yaml?

@ertimas
Author

ertimas commented Sep 16, 2024

Hi Andy,
Thank you for getting back to me.

OS

Oracle Linux 8.10 server

NICS

image

Config

I didn't modify config.yaml, though I also didn't have that file in /etc/rancher/rke2/, so I symlinked one to the existing /etc/rancher/rke2/rke2.yaml. That got me past that issue, though I'm left with a new one...

image

hauler.repo is in fileserver/.
image

Here's the output from netstat
image

Any thoughts on how to troubleshoot fileserver not seeing hauler.repo would be much appreciated. Note: my hauler directory has global read/write permissions.

@clemenko
Owner

The two issues may be related:
A. routes
B. hauler.

Can you try a clean install of Oracle 8 and see if any routes are there? /proc/net/route specifically?
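
For readers following along, a minimal sketch of how the contents of /proc/net/route could be checked for a default route. It assumes an IPv4 table as printed on a little-endian host such as x86-64; the function names are hypothetical and are not part of hauler or the script discussed here.

```python
import socket
import struct
from typing import List, Tuple


def hex_to_ipv4(hex_addr: str) -> str:
    """Convert the hex form used in /proc/net/route (little-endian host) to dotted quad."""
    return socket.inet_ntoa(struct.pack("<L", int(hex_addr, 16)))


def default_routes(route_table: str) -> List[Tuple[str, str]]:
    """Return (interface, gateway) pairs for every IPv4 default route in the table text."""
    found = []
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue  # blank or truncated line
        iface, dest, gateway, mask = fields[0], fields[1], fields[2], fields[7]
        # A default route has destination 0.0.0.0 and netmask 0.0.0.0.
        if dest == "00000000" and mask == "00000000":
            found.append((iface, hex_to_ipv4(gateway)))
    return found
```

On the host, the text would come from reading /proc/net/route; an empty list means no default route is present, which matches the symptom in the issue title.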

@ertimas
Author

ertimas commented Sep 16, 2024

Here's /proc/net/route. Note I've got a second interface attached so that I can get the VM set up before "airgapping" it.

image

@clemenko
Owner

I wonder if the error was an isolated issue. Can you try to install it again?

Wait, are you disabling the NIC when installing the stack? Kubernetes needs a NIC at all times.

@ertimas
Author

ertimas commented Sep 16, 2024

Last time I left both NICs on during installation. I've tried it a couple of times with each NIC and with both, with no IP given to hauler_all_the_things.sh, and I still get the same issue.

@clemenko
Owner

There is no IP needed for the control function. Can you run the control command with bash -x ./hauler_all_the_things.sh control ?

@ertimas
Author

ertimas commented Sep 16, 2024

I found that if I didn't provide an IP then it might look at the second NIC, and it looks like this made it happen.
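
On a host with two NICs it helps to see which IPv4 address belongs to which interface before passing one explicitly. The sketch below parses the output of `ip -4 -o addr show`. How hauler_all_the_things.sh itself picks an address is not shown in this thread, so this is only an illustration, and the function name is hypothetical.

```python
from typing import Dict, List


def ipv4_by_interface(ip_output: str) -> Dict[str, List[str]]:
    """Map interface name -> IPv4 addresses, given `ip -4 -o addr show` output."""
    result: Dict[str, List[str]] = {}
    for line in ip_output.splitlines():
        fields = line.split()
        # Expected shape per line: "<idx>: <iface> inet <addr>/<prefix> ..."
        if len(fields) < 4 or fields[2] != "inet":
            continue
        iface = fields[1]
        if iface == "lo":
            continue  # loopback is not reachable from other nodes
        address = fields[3].split("/")[0]  # drop the prefix length
        result.setdefault(iface, []).append(address)
    return result
```

The input text could be captured with `subprocess.run(["ip", "-4", "-o", "addr", "show"], capture_output=True, text=True).stdout` on Python 3.7 or later.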

Here is the result of bash -x ./hauler_all_the_things.sh control

image

Here's the log where createrepo was run
image

@clemenko
Owner

Can you see if the hauler.repo file is there? Try: curl -sfL http://10.237.13.41:8080/hauler.repo
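
The same check can be scripted when curl is not convenient. This is a rough sketch using only the Python standard library; `http_status` is a hypothetical helper name, and the URL in the usage note is the one quoted in this thread.

```python
import urllib.error
import urllib.request
from typing import Optional


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error such as 404
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timed out, or host unreachable
```

For example, `http_status("http://10.237.13.41:8080/hauler.repo")` distinguishes "server up but file missing" (a status code) from "nothing answering on that port" (`None`).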

@ertimas
Author

ertimas commented Sep 17, 2024

I can see the file, but it's inaccessible when running the hauler server; see the yellow box. Note: if I run python3 -m http.server I can access the file from another box, which leads me to believe basic networking and file permissions are fine.

Screenshot 2024-09-17 at 09 24 12

@clemenko
Owner

I wonder if Hauler is not able to serve anything on a node with 2 NICs.
What does ss -tln show?
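
As a rough illustration of what `ss -tln` reports, the sketch below lists IPv4 listening sockets from the text of /proc/net/tcp, which shows whether a listener is bound to 0.0.0.0 or to a single NIC's address. It assumes a little-endian host and does not read /proc/net/tcp6, where IPv6 and dual-stack wildcard listeners appear. The function name is hypothetical.

```python
import socket
import struct
from typing import List, Tuple

TCP_LISTEN = "0A"  # state code for LISTEN in /proc/net/tcp


def listening_ipv4(proc_net_tcp: str) -> List[Tuple[str, int]]:
    """Return (bind address, port) for each IPv4 socket in LISTEN state."""
    listeners = []
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_LISTEN:
            continue
        hex_ip, hex_port = fields[1].split(":")
        # The address is host-order hex (reversed on x86-64); the port is plain hex.
        ip = socket.inet_ntoa(struct.pack("<L", int(hex_ip, 16)))
        listeners.append((ip, int(hex_port, 16)))
    return listeners
```

A result containing `("0.0.0.0", 8080)` would mean something is listening on port 8080 on every IPv4 interface; an address specific to one NIC would mean the other NIC cannot reach it.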

@ertimas
Author

ertimas commented Sep 17, 2024

It shows essentially the same thing as netstat -lnt.

Manually running hauler store serve fileserver <my store directory> did work. It took about a minute for it to come up. Here are the debug logs in case they're helpful. Marking this as closed.

image
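
Since the fileserver reportedly needed about a minute to come up, a check made too early can look like a failure. A minimal polling sketch follows, assuming the server eventually answers HTTP 200 on the given URL; the helper name and the default timings are assumptions, not values taken from hauler.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 180.0, interval: float = 5.0) -> bool:
    """Poll url until it returns HTTP 200 or until timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet: refused, timed out, or an HTTP error; retry
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A script could call this after starting the fileserver and only continue once it returns `True`, instead of probing once and giving up.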

@ertimas ertimas closed this as completed Sep 17, 2024
@clemenko
Owner

Huh. I see you closed the issue. Is it working now?

@ertimas
Author

ertimas commented Sep 17, 2024

TL;DR: I couldn't get your script to work.

There were two issues.

  1. The fileserver service looks like it's fine, but after several minutes it still can't be reached with curl. So I ran hauler store serve -l debug fileserver <my hauler directory> and it worked. The command showed the fileserver taking more than a minute to come up. See the screenshot I posted.
  2. The registry fails to load. This is due to a longhorn-csi component in the hauler store .zst being corrupted during tar/untar. Removing the csi components from airgap_hauler.yaml fixed it.
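
One way to catch the kind of archive corruption described in the second item is to compare a checksum taken before the transfer with one taken after it. This is a generic sketch, not something the script in this thread is known to do; the helper names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of the file at path, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source_path: str, copied_path: str) -> bool:
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(source_path) == sha256_of(copied_path)
```

Recording the digest of the .zst on the connected side and checking it again on the airgapped side would show whether the file changed in transit, before any registry load is attempted.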
