From e5d2aec26c8692531cdde713519bf9788ebc1b51 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc=20Sch=C3=B6chlin?=
Date: Mon, 16 Sep 2024 15:24:16 +0200
Subject: [PATCH] add preinstallation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Marc Schöchlin
---
 .../concept-guide/preinstall-checklist.md     | 116 ++++++++++++++++++
 1 file changed, 116 insertions(+)
 create mode 100644 docs/guides/concept-guide/preinstall-checklist.md

diff --git a/docs/guides/concept-guide/preinstall-checklist.md b/docs/guides/concept-guide/preinstall-checklist.md
new file mode 100644
index 0000000000..0257f0099c
--- /dev/null
+++ b/docs/guides/concept-guide/preinstall-checklist.md
@@ -0,0 +1,116 @@
+---
+sidebar_label: Pre Installation Checklist
+sidebar_position: 49
+---
+
+# Pre Installation Checklist
+
+:::warning
+
+This checklist is currently a work in progress and incomplete.
+
+It is imperative that the following topics are clarified and the described resources are available before performing
+the initial installation.
+:::
+
+This list describes some aspects (without claiming to be exhaustive) that should be clarified before a pilot installation and, at the latest, before a production installation.
+The aim of this list is to reduce waiting times, unsuccessful attempts, errors and major adjustment work in the
+installation process itself as well as in subsequent operation.
+
+## Network configuration of nodes and tenant networks
+
+TBD:
+
+* It must be decided how the tenant networks should be separated in OpenStack (Neutron).
+* It must be decided how the underlay network of the cloud platform should be designed
+  (e.g. native Layer 2, Layer 2 underlay with tenant VLANs, Layer 3 underlay).
+* Layer 3 underlay
+  * FRR routing on the nodes?
+  * ASN naming scheme
+
+## Hardware sizing of the platform
+
+
+## Required IP Networks
+
+Estimate the expected number of IP addresses and plan sufficient reserves so that no adjustments to the networks will be necessary at a later date.
+The installation can be carried out via IPv4, IPv6, or a hybrid of both.
+
+ * Frontend Access: A dedicated IP address space / network for services published by the cloud platform and its users
+   * this is in most cases a public IPv4 network
+   * at least TCP port 443 should be accessible for all addresses of this network from other networks
+ * Node Communication: A dedicated private IP address space / network for the internal communication between the nodes
+   * every node needs a dedicated IP
+   * a DHCP range for installation might be useful, but is not mandatory
+   * all nodes in this network should have access to the NTP server
+   * all nodes should have access to public DNS servers and HTTP/HTTPS servers
+ * In some cases, it may make sense to operate Ceph in a dedicated network or multiple dedicated networks (public, cluster).
+   Methods for high-performance and scalable access to the storage:
+   * very high-performance routing (Layer 3), for example via switch infrastructure
+   * dedicated network adapters in the compute nodes for direct access to the storage network
+ * Management: A private IP address space / network for the hardware out-of-band management of the nodes
+   * every node needs a dedicated management IP
+   * a DHCP range for installation might be useful, but is not mandatory
+ * Manager Access: Dedicated IP addresses for access to the manager nodes
+   * every manager gets a dedicated external address for SSH and WireGuard access
+   * the IP addresses should not be part of the "Frontend Access" network
+   * at least ports 443/TCP and 51820/UDP should be reachable from external networks
+
+## Domains and Hosts
+
+ * Cloud Domain: A dedicated subdomain used for the cloud environment
+   (e.g. `*.zone1.landscape.scs.community`)
+ * Internal API endpoint: A hostname for the internal API endpoint which points to an address in the "Node Communication" network
+   (e.g. `api-internal.zone1.landscape.scs.community`)
+ * External API endpoint: A hostname for the external API endpoint which points to an address in the "Frontend Access" network
+   (e.g. `api.zone1.landscape.scs.community`)
+
+## TLS Certificates
+
+Not all domains used for the environment will be publicly accessible, so “Let's Encrypt” certificates
+cannot always be used without problems. We recommend having official TLS certificates available for at least the two API endpoints.
+Either a multi-domain certificate (with SANs) or a wildcard certificate (wildcard on the first level of the cloud domain) can be used for this.
+
+## Access to installation resources
+
+For the download of installation data such as container images, operating system packages, etc.,
+either access to public networks must be provided, or a caching proxy or a dedicated
+repository server must be reachable directly from the “Node Communication” network.
+
+The [Configuration Guide](https://docs.scs.community/docs/iaas/guides/configuration-guide/proxy) provides more detailed information on how this can be configured.
+
+## NTP Infrastructure
+
+ * The deployed nodes should have permanent access to at least 3 NTP servers
+ * It has proven advantageous for the 3 control nodes to have access to NTP servers
+   and to act as NTP servers for the other nodes of the SCS installation.
+ * The NTP servers used should not run on virtual hardware
+   (Depending on the architecture and the virtualization platform, this can otherwise cause minor or major problems in special situations.)
+
+## Ceph Storage
+
+### General
+
+TBD:
+* CRUSH / failure domain properties
+* Amount of usable storage
+* External Ceph storage installation
+* Dedicated Ceph nodes or hyperconverged setup?
+
+### Disk Storage
+
+* What use cases can be expected and on what scale?
+
+### Object Storage
+
+* RADOS Gateway setup
+
+## Miscellaneous Topics
+
+* Decide which base operating system is used (e.g. RHEL or Ubuntu) and whether this fits the hardware support, strategy, upgrade support and cost structure.
+* A private Git repository for the [configuration repository](https://osism.tech/docs/guides/configuration-guide/configuration-repository)
+* The public keys of all administrators
+* Connection and integration into existing operational monitoring.
+
+
+
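Review note on the "Required IP Networks" section: the advice to estimate the number of addresses and plan reserves could be backed by a small calculation. The following is only a sketch and not part of the patch. The node counts, the 50% reserve, the helper name `smallest_prefix` and the `10.0.0.0` base address are all assumptions chosen for illustration, and only IPv4 is considered.

```python
import ipaddress
import math


def smallest_prefix(hosts_needed: int, reserve: float = 0.5) -> int:
    """Return the longest IPv4 prefix length whose usable host count
    covers hosts_needed plus a fractional reserve (hypothetical helper)."""
    target = math.ceil(hosts_needed * (1 + reserve))
    for prefix in range(30, 0, -1):
        # Usable hosts exclude the network and broadcast addresses.
        usable = 2 ** (32 - prefix) - 2
        if usable >= target:
            return prefix
    raise ValueError("requirement does not fit into a single IPv4 network")


# Assumed address counts per network; replace with your own estimates.
requirements = {
    "Frontend Access": 200,
    "Node Communication": 40,
    "Management": 40,
}

plan = {name: smallest_prefix(count) for name, count in requirements.items()}

for name, prefix in plan.items():
    # 10.0.0.0 is only a placeholder base address used to size the network.
    network = ipaddress.ip_network(f"10.0.0.0/{prefix}")
    print(f"{name}: /{prefix} ({network.num_addresses - 2} usable addresses)")
```

With these assumed numbers, the frontend network would need at least a /23 and the two internal networks a /26 each. A larger reserve factor shifts the result toward shorter prefixes.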