diff --git a/cluster/jupyterhub/config.yaml b/cluster/jupyterhub/config.yaml new file mode 100644 index 0000000..afcc429 --- /dev/null +++ b/cluster/jupyterhub/config.yaml @@ -0,0 +1,720 @@ +# fullnameOverride and nameOverride distinguish blank strings, null values, +# and non-blank strings. For more details, see the configuration reference. +fullnameOverride: "" +nameOverride: + +# enabled is ignored by the jupyterhub chart itself, but a chart depending on +# the jupyterhub chart conditionally can make use of this config option as the +# condition. +enabled: + +# custom can contain anything you want to pass to the hub pod, as all passed +# Helm template values will be made available there. +custom: {} + +# imagePullSecret is configuration to create a k8s Secret that Helm chart's pods +# can get credentials from to pull their images. +imagePullSecret: + create: false + automaticReferenceInjection: true + registry: + username: + password: + email: +# imagePullSecrets is configuration to reference the k8s Secret resources the +# Helm chart's pods can get credentials from to pull their images. +imagePullSecrets: [] + +# hub relates to the hub pod, responsible for running JupyterHub, its configured +# Spawner class KubeSpawner, and its configured Proxy class +# ConfigurableHTTPProxy. KubeSpawner creates the user pods, and +# ConfigurableHTTPProxy speaks with the actual ConfigurableHTTPProxy server in +# the proxy pod. +hub: + revisionHistoryLimit: + config: + JupyterHub: + admin_access: true + authenticator_class: dummy + service: + type: ClusterIP + annotations: {} + ports: + nodePort: + extraPorts: [] + loadBalancerIP: + baseUrl: / + cookieSecret: + initContainers: [] + nodeSelector: {} + tolerations: [] + concurrentSpawnLimit: 64 + consecutiveFailureLimit: 5 + activeServerLimit: + deploymentStrategy: + ## type: Recreate + ## - sqlite-pvc backed hubs require the Recreate deployment strategy as a + ## typical PVC storage can only be bound to one pod at a time.
+ ## - JupyterHub isn't designed to support being run in parallel. More work + ## needs to be done in JupyterHub itself before a fully highly available (HA) + ## deployment of JupyterHub on k8s is possible. + type: Recreate + db: + type: sqlite-pvc + upgrade: + pvc: + annotations: {} + selector: {} + accessModes: + - ReadWriteOnce + storage: 1Gi + subPath: + storageClassName: + url: + password: + labels: {} + annotations: {} + command: [] + args: [] + extraConfig: {} + extraFiles: {} + extraEnv: {} + extraContainers: [] + extraVolumes: [] + extraVolumeMounts: [] + image: + name: quay.io/jupyterhub/k8s-hub + tag: "set-by-chartpress" + pullPolicy: + pullSecrets: [] + resources: {} + podSecurityContext: + runAsNonRoot: true + fsGroup: 1000 + seccompProfile: + type: "RuntimeDefault" + containerSecurityContext: + runAsUser: 1000 + runAsGroup: 1000 + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + lifecycle: {} + loadRoles: {} + services: {} + pdb: + enabled: false + maxUnavailable: + minAvailable: 1 + networkPolicy: + enabled: true + ingress: [] + egress: [] + egressAllowRules: + cloudMetadataServer: true + dnsPortsCloudMetadataServer: true + dnsPortsKubeSystemNamespace: true + dnsPortsPrivateIPs: true + nonPrivateIPs: true + privateIPs: true + interNamespaceAccessLabels: ignore + allowedIngressPorts: [] + allowNamedServers: false + namedServerLimitPerUser: + authenticatePrometheus: + redirectToServer: + shutdownOnLogout: + templatePaths: [] + templateVars: {} + livenessProbe: + # The livenessProbe's aim is to give JupyterHub sufficient time to start up but + # be able to restart if it becomes unresponsive for ~5 min. + enabled: true + initialDelaySeconds: 300 + periodSeconds: 10 + failureThreshold: 30 + timeoutSeconds: 3 + readinessProbe: + # The readinessProbe's aim is to provide a successful startup indication, + # but following that never become unready before its livenessProbe fails and + # restarts it if needed.
To become unready following startup serves no + # purpose as there is no other pod to fall back to in our non-HA deployment. + enabled: true + initialDelaySeconds: 0 + periodSeconds: 2 + failureThreshold: 1000 + timeoutSeconds: 1 + existingSecret: + serviceAccount: + create: true + name: + annotations: {} + extraPodSpec: {} + +rbac: + create: true + +# proxy relates to the proxy pod, the proxy-public service, and the autohttps +# pod and proxy-http service. +proxy: + secretToken: + annotations: {} + deploymentStrategy: + ## type: Recreate + ## - JupyterHub's interaction with the CHP proxy becomes a lot more robust + ## with this configuration. To understand this, consider that JupyterHub + ## during startup will interact a lot with the k8s service to reach a + ## ready proxy pod. If the hub pod during a helm upgrade is restarting + ## directly while the proxy pod is making a rolling upgrade, the hub pod + ## could end up running a sequence of interactions with the old proxy pod + ## and finishing up the sequence of interactions with the new proxy pod. + ## As CHP proxy pods carry individual state this is very error prone. One + ## outcome when not using Recreate as a strategy has been that user pods + ## have been deleted by the hub pod because it considered them unreachable + ## as it only configured the old proxy pod but not the new before trying + ## to reach them. + type: Recreate + ## rollingUpdate: + ## - WARNING: + ## This is required to be set explicitly blank!
Without it being + ## explicitly blank, k8s will let eventual old values under rollingUpdate + ## remain and then the Deployment becomes invalid and a helm upgrade would + ## fail with an error like this: + ## + ## UPGRADE FAILED + ## Error: Deployment.apps "proxy" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy `type` is 'Recreate' + ## Error: UPGRADE FAILED: Deployment.apps "proxy" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy `type` is 'Recreate' + rollingUpdate: + # service relates to the proxy-public service + service: + type: LoadBalancer + labels: {} + annotations: {} + nodePorts: + http: + https: + disableHttpPort: false + extraPorts: [] + loadBalancerIP: + loadBalancerSourceRanges: [] + # chp relates to the proxy pod, which is responsible for routing traffic based + # on dynamic configuration sent from JupyterHub to CHP's REST API. + chp: + revisionHistoryLimit: + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + image: + name: quay.io/jupyterhub/configurable-http-proxy + # tag is automatically bumped to new patch versions by the + # watch-dependencies.yaml workflow. 
+ # + tag: "4.6.2" # https://github.com/jupyterhub/configurable-http-proxy/tags + pullPolicy: + pullSecrets: [] + extraCommandLineFlags: [] + livenessProbe: + enabled: true + initialDelaySeconds: 60 + periodSeconds: 10 + failureThreshold: 30 + timeoutSeconds: 3 + readinessProbe: + enabled: true + initialDelaySeconds: 0 + periodSeconds: 2 + failureThreshold: 1000 + timeoutSeconds: 1 + resources: {} + defaultTarget: + errorTarget: + extraEnv: {} + nodeSelector: {} + tolerations: [] + networkPolicy: + enabled: true + ingress: [] + egress: [] + egressAllowRules: + cloudMetadataServer: true + dnsPortsCloudMetadataServer: true + dnsPortsKubeSystemNamespace: true + dnsPortsPrivateIPs: true + nonPrivateIPs: true + privateIPs: true + interNamespaceAccessLabels: ignore + allowedIngressPorts: [http, https] + pdb: + enabled: false + maxUnavailable: + minAvailable: 1 + extraPodSpec: {} + # traefik relates to the autohttps pod, which is responsible for TLS + # termination when proxy.https.type=letsencrypt. + traefik: + revisionHistoryLimit: + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + image: + name: traefik + # tag is automatically bumped to new patch versions by the + # watch-dependencies.yaml workflow. 
+ # + tag: "v3.1.2" # ref: https://hub.docker.com/_/traefik?tab=tags + pullPolicy: + pullSecrets: [] + hsts: + includeSubdomains: false + preload: false + maxAge: 15724800 # About 6 months + resources: {} + labels: {} + extraInitContainers: [] + extraEnv: {} + extraVolumes: [] + extraVolumeMounts: [] + extraStaticConfig: {} + extraDynamicConfig: {} + nodeSelector: {} + tolerations: [] + extraPorts: [] + networkPolicy: + enabled: true + ingress: [] + egress: [] + egressAllowRules: + cloudMetadataServer: true + dnsPortsCloudMetadataServer: true + dnsPortsKubeSystemNamespace: true + dnsPortsPrivateIPs: true + nonPrivateIPs: true + privateIPs: true + interNamespaceAccessLabels: ignore + allowedIngressPorts: [http, https] + pdb: + enabled: false + maxUnavailable: + minAvailable: 1 + serviceAccount: + create: true + name: + annotations: {} + extraPodSpec: {} + secretSync: + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + image: + name: quay.io/jupyterhub/k8s-secret-sync + tag: "set-by-chartpress" + pullPolicy: + pullSecrets: [] + resources: {} + labels: {} + https: + enabled: false + type: letsencrypt + #type: letsencrypt, manual, offload, secret + letsencrypt: + contactEmail: + # Specify custom server here (https://acme-staging-v02.api.letsencrypt.org/directory) to hit staging LE + acmeServer: https://acme-v02.api.letsencrypt.org/directory + manual: + key: + cert: + secret: + name: + key: tls.key + crt: tls.crt + hosts: [] + +# singleuser relates to the configuration of KubeSpawner which runs in the hub +# pod, and its spawning of user pods such as jupyter-myusername. 
+# singleuser: +# podNameTemplate: +# extraTolerations: [] +# nodeSelector: {} +# extraNodeAffinity: +# required: [] +# preferred: [] +# extraPodAffinity: +# required: [] +# preferred: [] +# extraPodAntiAffinity: +# required: [] +# preferred: [] +# networkTools: +# image: +# name: quay.io/jupyterhub/k8s-network-tools +# tag: "set-by-chartpress" +# pullPolicy: +# pullSecrets: [] +# resources: {} +# cloudMetadata: +# # block set to true will append a privileged initContainer using the +# # iptables to block the sensitive metadata server at the provided ip. +# blockWithIptables: true +# ip: 169.254.169.254 +# networkPolicy: +# enabled: true +# ingress: [] +# egress: [] +# egressAllowRules: +# cloudMetadataServer: false +# dnsPortsCloudMetadataServer: true +# dnsPortsKubeSystemNamespace: true +# dnsPortsPrivateIPs: true +# nonPrivateIPs: true +# privateIPs: false +# interNamespaceAccessLabels: ignore +# allowedIngressPorts: [] +# events: true +# extraAnnotations: {} +# extraLabels: +# hub.jupyter.org/network-access-hub: "true" +# extraFiles: {} +# extraEnv: {} +# lifecycleHooks: {} +# initContainers: [] +# extraContainers: [] +# allowPrivilegeEscalation: false +# uid: 1000 +# fsGid: 100 +# serviceAccountName: +# storage: +# type: dynamic +# extraLabels: {} +# extraVolumes: [] +# extraVolumeMounts: [] +# static: +# pvcName: +# subPath: "{username}" +# capacity: 10Gi +# homeMountPath: /home/jovyan +# dynamic: +# storageClass: +# pvcNameTemplate: claim-{username}{servername} +# volumeNameTemplate: volume-{username}{servername} +# storageAccessModes: [ReadWriteOnce] +# subPath: +# image: +# name: quay.io/jupyterhub/k8s-singleuser-sample +# tag: "set-by-chartpress" +# pullPolicy: +# pullSecrets: [] +# startTimeout: 300 +# cpu: +# limit: +# guarantee: +# memory: +# limit: +# guarantee: 1G +# extraResource: +# limits: {} +# guarantees: {} +# cmd: jupyterhub-singleuser +# defaultUrl: +# extraPodConfig: {} +# profileList: [] + +# scheduling relates to the user-scheduler pods 
and user-placeholder pods. +scheduling: + userScheduler: + enabled: true + revisionHistoryLimit: + replicas: 2 + logLevel: 4 + # plugins are configured on the user-scheduler to make us score how we + # schedule user pods in a way to help us schedule on the most busy node. By + # doing this, we help scale down more effectively. It isn't obvious how to + # enable/disable scoring plugins, and configure them, to accomplish this. + # + # plugins ref: https://kubernetes.io/docs/reference/scheduling/config/#scheduling-plugins-1 + # migration ref: https://kubernetes.io/docs/reference/scheduling/config/#scheduler-configuration-migrations + # + plugins: + score: + # These scoring plugins are enabled by default according to + # https://kubernetes.io/docs/reference/scheduling/config/#scheduling-plugins + # 2022-02-22. + # + # Enabled with high priority: + # - NodeAffinity + # - InterPodAffinity + # - NodeResourcesFit + # - ImageLocality + # Remains enabled with low default priority: + # - TaintToleration + # - PodTopologySpread + # - VolumeBinding + # Disabled for scoring: + # - NodeResourcesBalancedAllocation + # + disabled: + # We disable these plugins (with regards to scoring) to not interfere + # or complicate our use of NodeResourcesFit. + - name: NodeResourcesBalancedAllocation + # Disable plugins to be allowed to enable them again with a different + # weight and avoid an error. + - name: NodeAffinity + - name: InterPodAffinity + - name: NodeResourcesFit + - name: ImageLocality + enabled: + - name: NodeAffinity + weight: 14631 + - name: InterPodAffinity + weight: 1331 + - name: NodeResourcesFit + weight: 121 + - name: ImageLocality + weight: 11 + pluginConfig: + # Here we declare that we should optimize pods to fit based on a + # MostAllocated strategy instead of the default LeastAllocated. 
+ - name: NodeResourcesFit + args: + scoringStrategy: + resources: + - name: cpu + weight: 1 + - name: memory + weight: 1 + type: MostAllocated + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + image: + # IMPORTANT: Bumping the minor version of this binary should go hand in + # hand with an inspection of the user-scheduler's RBAC + # resources that we have forked in + # templates/scheduling/user-scheduler/rbac.yaml. + # + # Debugging advice: + # + # - Is configuration of kube-scheduler broken in + # templates/scheduling/user-scheduler/configmap.yaml? + # + # - Is the kube-scheduler binary incompatible with a k8s api-server + # that is too new or too old? + # + # - You can update the GitHub workflow that runs tests to + # include "deploy/user-scheduler" in the k8s namespace report + # and reduce the user-scheduler deployment's replicas to 1 in + # dev-config.yaml to get relevant logs from the user-scheduler + # pods. Inspect the "Kubernetes namespace report" action! + # + # - Typical failures are that kube-scheduler fails to search for + # resources via its "informers", and won't start trying to + # schedule pods before they succeed which may require + # additional RBAC permissions or that the k8s api-server is + # aware of the resources. + # + # - If "successfully acquired lease" can be seen in the logs, it + # is a good sign kube-scheduler is ready to schedule pods. + # + name: registry.k8s.io/kube-scheduler + # tag is automatically bumped to new patch versions by the + # watch-dependencies.yaml workflow. The minor version is pinned in the + # workflow, and should be updated there if a minor version bump is done + # here. We aim to stay around 1 minor version behind the latest k8s + # version.
+ # + tag: "v1.28.13" # ref: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG + pullPolicy: + pullSecrets: [] + nodeSelector: {} + tolerations: [] + labels: {} + annotations: {} + pdb: + enabled: true + maxUnavailable: 1 + minAvailable: + resources: {} + serviceAccount: + create: true + name: + annotations: {} + extraPodSpec: {} + podPriority: + enabled: false + globalDefault: false + defaultPriority: 0 + imagePullerPriority: -5 + userPlaceholderPriority: -10 + userPlaceholder: + enabled: true + image: + name: registry.k8s.io/pause + # tag is automatically bumped to new patch versions by the + # watch-dependencies.yaml workflow. + # + # If you update this, also update prePuller.pause.image.tag + # + tag: "3.10" + pullPolicy: + pullSecrets: [] + revisionHistoryLimit: + replicas: 0 + labels: {} + annotations: {} + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + resources: {} + corePods: + tolerations: + - key: hub.jupyter.org/dedicated + operator: Equal + value: core + effect: NoSchedule + - key: hub.jupyter.org_dedicated + operator: Equal + value: core + effect: NoSchedule + nodeAffinity: + matchNodePurpose: prefer + userPods: + tolerations: + - key: hub.jupyter.org/dedicated + operator: Equal + value: user + effect: NoSchedule + - key: hub.jupyter.org_dedicated + operator: Equal + value: user + effect: NoSchedule + nodeAffinity: + matchNodePurpose: prefer + +# prePuller relates to the hook|continuous-image-puller DaemonSets +prePuller: + revisionHistoryLimit: + labels: {} + annotations: {} + resources: {} + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + extraTolerations: [] + # hook relates to the
hook-image-awaiter Job and hook-image-puller DaemonSet + hook: + enabled: true + pullOnlyOnChanges: true + # image and the configuration below relates to the hook-image-awaiter Job + image: + name: quay.io/jupyterhub/k8s-image-awaiter + tag: "set-by-chartpress" + pullPolicy: + pullSecrets: [] + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + podSchedulingWaitDuration: 10 + nodeSelector: {} + tolerations: [] + resources: {} + serviceAccount: + create: true + name: + annotations: {} + continuous: + enabled: true + pullProfileListImages: true + extraImages: {} + pause: + containerSecurityContext: + runAsNonRoot: true + runAsUser: 65534 # nobody user + runAsGroup: 65534 # nobody group + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + seccompProfile: + type: "RuntimeDefault" + image: + name: registry.k8s.io/pause + # tag is automatically bumped to new patch versions by the + # watch-dependencies.yaml workflow. + # + # If you update this, also update scheduling.userPlaceholder.image.tag + # + tag: "3.10" + pullPolicy: + pullSecrets: [] + +ingress: + enabled: false + annotations: {} + ingressClassName: + hosts: [] + pathSuffix: + pathType: Prefix + tls: [] + +# cull relates to the jupyterhub-idle-culler service, responsible for evicting +# inactive singleuser pods. 
+#
+# The configuration below, except for enabled, corresponds to command-line flags
+# for jupyterhub-idle-culler as documented here:
+# https://github.com/jupyterhub/jupyterhub-idle-culler#as-a-standalone-script
+#
+cull:
+  enabled: true
+  users: false # --cull-users
+  adminUsers: true # --cull-admin-users
+  removeNamedServers: false # --remove-named-servers
+  timeout: 3600 # --timeout
+  every: 600 # --cull-every
+  concurrency: 10 # --concurrency
+  maxAge: 0 # --max-age
+
+debug:
+  enabled: false
+
+global:
+  safeToShowValues: false
\ No newline at end of file
diff --git a/cluster/jupyterhub/readme.md b/cluster/jupyterhub/readme.md
new file mode 100644
index 0000000..8e3380d
--- /dev/null
+++ b/cluster/jupyterhub/readme.md
@@ -0,0 +1,72 @@
+# Installing JupyterHub
+JupyterHub is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.
+
+## Initialize a Helm chart configuration file
+Helm charts contain templates that can be rendered to the Kubernetes resources to be installed. A user of a Helm chart can override the chart’s default values to influence how the templates render.
+
+In this step we will initialize a chart configuration file for you to adjust your installation of JupyterHub. We will name it `config.yaml` and refer to it by that name from here on.
+
+```bash
+# This file can update the JupyterHub Helm chart's default configuration values.
+#
+# For reference see the configuration reference and default values, but make
+# sure to refer to the Helm chart version of interest to you!
+#
+# Introduction to YAML: https://www.youtube.com/watch?v=cdLNKUoMc6c
+# Chart config reference: https://zero-to-jupyterhub.readthedocs.io/en/stable/resources/reference.html
+# Chart default values: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/HEAD/jupyterhub/values.yaml
+# Available chart versions: https://hub.jupyter.org/helm-chart/
+#
+```
+
+## Install JupyterHub
+This is a simple JupyterHub deployment using Helm. First, make Helm aware of the JupyterHub Helm chart repository:
+```bash
+helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
+helm repo update
+```
+Now install the chart configured by your `config.yaml` by running this command from the directory that contains your `config.yaml`:
+```bash
+helm upgrade --cleanup-on-fail \
+  --install <helm-release-name> jupyterhub/jupyterhub \
+  --namespace <k8s-namespace> \
+  --create-namespace \
+  --version=<chart-version> \
+  --values config.yaml
+```
+where:
+
+- `<helm-release-name>` refers to a Helm release name, an identifier used to differentiate chart installations. You need it when you are changing or deleting the configuration of this chart installation. If your Kubernetes cluster will contain multiple JupyterHubs, make sure to differentiate them. You can list your Helm releases with `helm list`.
+
+- `<k8s-namespace>` refers to a Kubernetes namespace, an identifier used to group Kubernetes resources, in this case all Kubernetes resources associated with the JupyterHub chart. You’ll need the namespace identifier for performing any commands with `kubectl`.
+
+- This step may take a moment, during which time there will be no output to your terminal. JupyterHub is being installed in the background.
+
+- If you get a `release named <helm-release-name> already exists` error, then you should delete the release by running `helm delete <helm-release-name>`. Then reinstall by repeating this step. If it persists, also do `kubectl delete namespace <k8s-namespace>` and try again.
+
+- In general, if something goes wrong with the install step, delete the Helm release by running `helm delete <helm-release-name>` before re-running the install command.
+
+- If you’re pulling a large Docker image, you may get an `Error: timed out waiting for the condition` error. In that case, add a `--timeout=<number-of-minutes>m` parameter to the `helm` command.
+
+- The `--version` parameter corresponds to the version of the Helm chart, not the version of JupyterHub. Each version of the JupyterHub Helm chart is paired with a specific version of JupyterHub. E.g., version 0.11.1 of the Helm chart runs JupyterHub 1.3.0. For a list of which JupyterHub version is installed in each version of the JupyterHub Helm chart, see the Helm chart repository.
+
+```bash
+helm upgrade --cleanup-on-fail \
+  --install jupyterhub jupyterhub/jupyterhub \
+  --namespace jupyterhub \
+  --create-namespace \
+  --values config.yaml
+```
+
+Find the IP address you can use to access JupyterHub. Run the following command until the EXTERNAL-IP of the proxy-public service is available, as in the example output.
+```bash
+kubectl --namespace <k8s-namespace> get service proxy-public
+```
+where `<k8s-namespace>` is the namespace you installed JupyterHub into. The output will look like this:
+```bash
+NAME           TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
+proxy-public   LoadBalancer   10.51.248.230   104.196.41.97   80:31916/TCP   1m
+```
+To use JupyterHub, enter the external IP for the proxy-public service into a browser. JupyterHub is running with a default dummy authenticator, so entering any username and password combination will let you enter the hub.
+
+Congratulations! Now that you have a basic JupyterHub running, you can extend it and optimize it in many ways to meet your needs.
\ No newline at end of file
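
The readme asks the reader to re-run `kubectl get service proxy-public` by hand until an EXTERNAL-IP shows up. A minimal sketch of automating that wait is below. The function names `external_ip_from_table` and `wait_for_external_ip` are hypothetical helpers, not part of the chart or of kubectl, and the sketch assumes the default table output shown in the readme, where EXTERNAL-IP is the fourth column and an unassigned address is printed as `<pending>` or `<none>`.

```shell
#!/bin/sh
# Hypothetical helper: reads the table printed by
# `kubectl get service proxy-public` on stdin and prints the EXTERNAL-IP
# column (4th field) of the proxy-public row. Returns non-zero while the
# address is still unassigned or the row is missing.
external_ip_from_table() {
  awk '
    $1 == "proxy-public" && $4 != "" && $4 != "<pending>" && $4 != "<none>" {
      print $4
      found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Hypothetical helper: polls until the load balancer address is assigned.
# The namespace defaults to "jupyterhub", the value used in the readme's
# concrete install command; the 10 second interval is an arbitrary choice.
# Note that it keeps looping if kubectl itself fails.
wait_for_external_ip() {
  namespace="${1:-jupyterhub}"
  until kubectl --namespace "$namespace" get service proxy-public | external_ip_from_table; do
    sleep 10
  done
}
```

After sourcing these definitions, running `wait_for_external_ip jupyterhub` would print the address once it is assigned. Parsing is kept separate from polling so the parsing step can be checked against captured output without a cluster.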