2021-12-04

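Yokosuka Tour

Military

Main gun barrel of the battleship Mutsu
Naval port tour
Memorial Ship Mikasa

I decided I should get out of the house once in a while, so I went to Yokosuka to see the Memorial Ship Mikasa and take the naval port tour.
It was enjoyable enough to blow away my memories of the Hardening event I took part in last week.

Main gun barrel of the battleship Mutsu

In Verny Park, right by Yokosuka Station, a barrel from the No. 4 main gun turret of Mutsu, the second ship of the Nagato class, was on display.
Mutsu was built at the Yokosuka Naval Arsenal and was a symbol of the former Imperial Japanese Navy, yet her career ended tragically when she sank after an explosion.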

f:id:FallenPigeon:20211204122232j:plain
f:id:FallenPigeon:20211204121325j:plain

The memorial hall next door also had a model on display.

f:id:FallenPigeon:20211204121420j:plain
f:id:FallenPigeon:20211204115811j:plain

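Naval port tour

Submarines, Aegis ships, Izumo, Ronald Reagan!

Yokosuka Port is home to adjacent bases of the Japan Maritime Self-Defense Force (JMSDF) and the US Navy's 7th Fleet. I joined a cruise that tours them, and I suspect it is rare for this many ships to be in port at once.

Two JMSDF submarines were moored side by side. The one with a vertical stern rudder appears to be the older type, and the one with two rudders angled upward appears to be the newer type (probably Soryu class).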

f:id:FallenPigeon:20211204123149j:plain

Next is Izumo. It may make headlines in the future because of its conversion into an aircraft carrier.
Its overall length is 248 m.

f:id:FallenPigeon:20211204125922j:plain

This is Ise. It even had helicopters on deck.
f:id:FallenPigeon:20211204123309j:plain

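Next, Ronald Reagan, the ninth ship of the Nimitz class.
At roughly 330 m, it is a size larger than Izumo, Japan's largest ship.
This was my first time seeing a nuclear-powered aircraft carrier.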

f:id:FallenPigeon:20211204123834j:plain

Many other vessels were lined up as well, and I was kept busy moving from one side of the cruise boat to the other.

f:id:FallenPigeon:20211204131345j:plain

f:id:FallenPigeon:20211204131736j:plain

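Someone was practicing the boatswain's call on deck and waved back at us.
The guide also explained the features of each ship and why there are so many evergreen trees, so the tour was rewarding from start to finish.

Memorial Ship Mikasa

I also visited the Memorial Ship Mikasa. Mikasa was the flagship at the Battle of Tsushima (known in Japan as the Battle of the Sea of Japan) in the Russo-Japanese War.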

f:id:FallenPigeon:20211204133342j:plain
f:id:FallenPigeon:20211204134428j:plain
f:id:FallenPigeon:20211204134831j:plain

The secondary guns on both sides are more impressive than they look in games.

f:id:FallenPigeon:20211204134512j:plain
f:id:FallenPigeon:20211204134801j:plain

The onboard facilities and exhibits were extensive, and the visit was well worth it.

f:id:FallenPigeon:20211204135152j:plain
f:id:FallenPigeon:20211204135238j:plain
f:id:FallenPigeon:20211204135440j:plain
f:id:FallenPigeon:20211204135602j:plain
f:id:FallenPigeon:20211204135002j:plain
f:id:FallenPigeon:20211204135759j:plain
f:id:FallenPigeon:20211204135858j:plain
f:id:FallenPigeon:20211204140006j:plain

I ran out of energy at this point and went home.
I was also interested in the JSDF museum and Sarushima, but those will have to wait for another time.

2021-12-04

Hardening 2021

Hardening

I took part in Hardening 2021 Active Fault.

My team's ending movie: Aftershock
youtu.be

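Hardening is a competition in which teams of roughly 6 to 10 people run an e-commerce site. Beyond technical skills such as hardening systems and responding to attacks, it also tests business-side skills such as handling customer inquiries and press conferences. Being evaluated on the business side as well is arguably what sets it apart from CTFs.

During the competition, crawlers purchase products on the e-commerce site, and the final sales figure forms the base score, so keeping the site available is critical. There is no time for rigorous hardening, however, so you work on the assumption that incidents will happen and triage as you go. The day itself was sheer pandemonium: web attacks such as XSS and command injection rained down, backdoors that had been lurking from the start sprang to life, customer data and credentials were exfiltrated, the web server went silent when its certificate expired, ransomware tampered with the content, the leader was hauled off to a press conference, and misconfigurations made in the confusion caused yet more incidents... Amid all this chaos, you also restock products and update the site.

Hardening also has a mechanism called the Marketplace (MP), where teams can pay out of their sales to bring in products and services. It is run as an auction, which leads to sophisticated information warfare as you try to guess other teams' bids. My team purchased Barracuda's fully managed WAF service. There was also a shady MP offering called a ransomware recovery service, which in the scenario seems to be run by the very criminal group that created the ransomware. My team escaped that damage, but other teams apparently had exchanges like the one below. Encrypting everything again right after restoring it is truly vile.

As a first-time participant I was flailing from start to finish, but looking back, I was reminded of how much preparation matters. Whether it was the e-commerce site or the firewall, the things I had practiced went smoothly, while things I was touching for the first time gave me trouble, and I even shot myself in the foot with a misconfiguration. In addition, the volume of tasks is far beyond what any individual can handle, so I felt that team building, such as aligning on a shared understanding and dividing roles, is also important. I fell short personally in many ways, but the fact that we went into the competition reasonably well prepared is thanks to the experienced members who led us and the members who put so much effort into preparation.

There was so much going on that words cannot convey all of it, but some footage of the day has been made public, so take a look if you are interested.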
Hardening Project (@WASForum) / Twitter
Hardening 2021 Active Fault開催のお知らせ | Web Application Security Forum
[H2021AF] Hardening 2021 Active Fault - Softening Day - YouTube

2021-11-25

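I Started Heating with Mining

Cryptocurrency mining

Operation
- Heat
- Electricity cost
- Revenue
Build
Summary

As part of my plan to cut living expenses, I brought in a new heating appliance.

Heating is a fairly significant part of living expenses in winter.
In terms of cost efficiency, the choices are gas heating or, among electric options, an air conditioner, which as a heat pump delivers more heat energy than the electricity it consumes. Even so, running it full time costs around 5,000 yen a month.
After agonizing over how to cut that down, an idea struck me.
"Wouldn't heating the room with mining kill two birds with one stone?"

Because cryptocurrency prices have stayed high, I ran a trial with a single graphics card, and it turned a clear profit.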
f:id:FallenPigeon:20211124232414p:plain
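New graphics cards have become so expensive that they no longer pay for themselves, so I set price alerts on Mercari and gathered cheap ones.
It took about two months all told, but the 12-GPU mining rig is now complete.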

f:id:FallenPigeon:20211124230645j:plain

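Operation

Heat

The hardware holds at 60-70 °C and pushes out a large volume of warm air.
The waste heat is more than enough even with the window open in November.
Its original purpose as a heater is fulfilled. Goodbye, air conditioner.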

f:id:FallenPigeon:20211124231022p:plain

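Electricity cost

Even with plenty of waste heat, it is just a junk heater if it does not pay for itself.
The key factor is the running cost, that is, the electricity bill.
The rig uses two power supplies, and the cost appears to be about 500 yen per day.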

f:id:FallenPigeon:20211124230715j:plain
f:id:FallenPigeon:20211124230618j:plain

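Revenue

Looking at electricity alone, gas heating and air conditioners win, but the real strength of mining heating is the cryptocurrency payback.
Against a daily electricity cost of 500 yen, revenue is 2,000 yen, for a profit of 1,500 yen per day. In theory that works out to 45,000 yen over a 30-day month.
Taking the savings on air conditioning into account, it looks like it can stay profitable.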
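
The arithmetic above can be written out as a small sketch. The function names are made up for illustration, and the figures are the rough daily values quoted in this post, not measured data; a 30-day month is assumed.

```go
package main

import "fmt"

// dailyProfit returns revenue minus electricity cost for one day, in yen.
func dailyProfit(revenueYen, electricityYen int) int {
	return revenueYen - electricityYen
}

// monthlyProfit scales the daily profit to a month of the given length.
// It assumes revenue and cost are the same every day, which ignores
// price swings and difficulty changes.
func monthlyProfit(revenueYen, electricityYen, days int) int {
	return dailyProfit(revenueYen, electricityYen) * days
}

func main() {
	// Rough figures from the post: 2000 yen/day revenue, 500 yen/day electricity.
	fmt.Println(dailyProfit(2000, 500))       // 1500
	fmt.Println(monthlyProfit(2000, 500, 30)) // 45000
}
```

Because revenue moves with the market, the monthly figure is best read as an upper-level estimate rather than a forecast.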

f:id:FallenPigeon:20211125001448p:plain

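Build

Mining software

For software I use NiceHash, which automatically mines whichever cryptocurrency is most profitable. This is what people call brain-dead mining.
Specifically, I boot NiceHash's dedicated OS from a USB drive.
Ethereum mining, currently the hot option, is the main workload, but rewards are paid out in Bitcoin.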

NiceHash - Leading Cryptocurrency Platform for Mining and Trading

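GPU configuration

High-end models are excluded because of their high cost and because many are LHR (Lite Hash Rate) variants, so the rig centers on low-to-mid-range models that can be expected to deliver reasonable cost performance.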

GPU（グラボ）のマイニング性能一覧【2021】 | プロガジ
f:id:FallenPigeon:20211124231022p:plain

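Parts

Power supplies

I use two 1200 W power supplies. With the circuit breakers in mind, they are fed from outlets on two separate circuits.
A single graphics card draws as much power as a CPU or more, so the setup has some headroom.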
f:id:FallenPigeon:20211125002610j:plain

Motherboard

I use a ridiculous board with 12 PCIe slots.
f:id:FallenPigeon:20211125002844j:plain

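Riser cards

Graphics cards generally use a PCIe x16 slot, but mounting all of them directly on the motherboard is not realistic.
The common approach is to use a motherboard with many PCIe x1 slots and attach an x16 slot to each of them with a riser card.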

f:id:FallenPigeon:20211125003158j:plain

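Rig frame

Next comes the frame that holds the riser cards and graphics cards. For the sake of air cooling it is basically left open.
Some people substitute a metal rack, but I did not want a short circuit, so I bought a proper frame. (Looks matter.)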

f:id:FallenPigeon:20211125003557j:plain

Fans

I attached eight generic ones.
f:id:FallenPigeon:20211125003921j:plain

Power button

f:id:FallenPigeon:20211125004045j:plain

Wireless adapter

I use a wireless connection to make the machine easier to move.
f:id:FallenPigeon:20211125004120j:plain

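Peripheral/SATA power adapter cables

Graphics cards are powered through 6-pin (75 W) and 8-pin (150 W) connectors.
The real headache is securing enough 6-pin and 8-pin cables.
A single card can need two of them, so even a 1200 W power supply simply does not have enough.
This is where adapters come in that convert peripheral and SATA power cables to 6-pin and 8-pin.
The important point is that they are Y-shaped, with two inputs.
A single peripheral or SATA power cable is rated below 75 W, so wiring without thinking can start a fire.
Drawing from two cables keeps each one within its rating and solves the cable shortage.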
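
The reasoning about two-input adapters can be checked with a rough sketch. Everything here is an assumption for illustration: the load is taken to split evenly across the input cables, and the 54 W per-cable rating is a commonly cited figure for a SATA power connector's 12 V line, not a value from this post. Check the ratings of your own cables before relying on it.

```go
package main

import "fmt"

// perSourceWatts returns the load on each upstream cable when a connector's
// draw is split evenly across `sources` cables. The even split is an
// assumption; real adapters may not balance perfectly.
func perSourceWatts(connectorWatts float64, sources int) float64 {
	return connectorWatts / float64(sources)
}

// withinRating reports whether each upstream cable stays at or below its rating.
func withinRating(connectorWatts float64, sources int, ratingWatts float64) bool {
	if sources <= 0 {
		return false
	}
	return perSourceWatts(connectorWatts, sources) <= ratingWatts
}

func main() {
	const sixPinWatts = 75.0   // 6-pin connector budget, as stated in the post
	const assumedRating = 54.0 // hypothetical per-cable rating (assumption)

	fmt.Println(withinRating(sixPinWatts, 1, assumedRating)) // false: one cable is overloaded
	fmt.Println(withinRating(sixPinWatts, 2, assumedRating)) // true: 37.5 W per cable
}
```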

f:id:FallenPigeon:20211125004252j:plain
f:id:FallenPigeon:20211125004302j:plain

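Other

Memory is the bare minimum, 8 GB.
The CPU is an i3.
No HDD or SSD, since they would only waste power.

Summary

Given the difficulty bomb and market prices I cannot be too optimistic, but it looks like I can run it as planned this winter.
From a sustainability standpoint it is questionable, but as a replacement for a heater I think it is perfectly reasonable.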

2021-09-18

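Kubelet pod creation flow (dockershim, containerd)

Kubernetes

What is inside a pod
Pod creation overview
Pod creation flow

Rough notes from looking at how the kubelet and CRI handle pod creation.

What is inside a pod

According to the figure below, the application containers in a pod share every namespace except mnt, uts, and pid.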
f:id:FallenPigeon:20210918194346p:plain

Besides the application containers, a pause container also runs in the pod.

https://speakerdeck.com/devinjeon/kubernetes-pod-internals-with-the-fundamentals-of-containers
f:id:FallenPigeon:20210918194744p:plain

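The pause container apparently provides the ipc and network namespaces, so that the network configuration(?) is kept even when an application container fails.

In short, a pod runs one or more application containers plus a pause container, all sharing the ipc and network namespaces.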

www.suse.com

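Pod creation overview

On a k8s worker node, pods are created through an interface called CRI (Container Runtime Interface) between the kubelet and the container runtime.
The main CRI calls involved in pod creation are:
1. RunPodSandbox creates the pod (pause container)
2. CreateContainer creates an application container
3. StartContainer starts the application container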
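
The order of these three calls can be sketched as follows. This is not the real CRI client: `sandboxRuntime` and `recordingRuntime` are hypothetical, heavily simplified stand-ins that only record which call was made, so the sequence is visible.

```go
package main

import "fmt"

// sandboxRuntime is a hypothetical, simplified stand-in for the CRI
// RuntimeService. The real API takes request messages over gRPC.
type sandboxRuntime interface {
	RunPodSandbox(podName string) (string, error)
	CreateContainer(sandboxID, containerName string) (string, error)
	StartContainer(containerID string) error
}

// recordingRuntime is a fake runtime that only records the calls it receives.
type recordingRuntime struct {
	calls []string
}

func (r *recordingRuntime) RunPodSandbox(podName string) (string, error) {
	r.calls = append(r.calls, "RunPodSandbox")
	return "sandbox-" + podName, nil
}

func (r *recordingRuntime) CreateContainer(sandboxID, containerName string) (string, error) {
	r.calls = append(r.calls, "CreateContainer")
	return sandboxID + "/" + containerName, nil
}

func (r *recordingRuntime) StartContainer(containerID string) error {
	r.calls = append(r.calls, "StartContainer")
	return nil
}

// startPod creates the sandbox once, then creates and starts each container in it.
func startPod(rt sandboxRuntime, podName string, containers []string) error {
	sandboxID, err := rt.RunPodSandbox(podName)
	if err != nil {
		return fmt.Errorf("run pod sandbox: %w", err)
	}
	for _, name := range containers {
		containerID, err := rt.CreateContainer(sandboxID, name)
		if err != nil {
			return fmt.Errorf("create container %s: %w", name, err)
		}
		if err := rt.StartContainer(containerID); err != nil {
			return fmt.Errorf("start container %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	rt := &recordingRuntime{}
	if err := startPod(rt, "demo", []string{"app", "sidecar"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rt.calls)
}
```

The sandbox is created once per pod, while the create and start pair repeats for every container.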

kubernetes.io
f:id:FallenPigeon:20210918195202p:plain

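Container creation was covered in my earlier posts on containerd and runc, so this time I roughly trace step 1 of pod creation.
As the name suggests, RunPodSandbox is the CRI call that creates the pod environment, and it appears to set up things such as the pod network.
Note that what a pod means differs by container runtime, so the processing apparently differs between VMM-based runtimes such as Kata Containers and kernel-sharing runtimes such as containerd (runc).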

service RuntimeService {

    // Sandbox operations.

    rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}  
    rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}  
    rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {}  
    rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {}  
    rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {}  

    // Container operations.  
    rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}  
    rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}  
    rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}  
    rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}  
    rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {}  
    rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}

    ...  
}

Pod creation flow

The kubelet appears to create pods in the following flow:
1. When a pod creation event is received, HandlePodAdditions is called.
2. Eventually kubeGenericRuntimeManager.SyncPod creates the pod and its containers via dockershim or CRI.

toutiao.io
f:id:FallenPigeon:20210919120615p:plain
f:id:FallenPigeon:20210918200240j:plain

syncLoopIteration

func (kl *Kubelet) syncLoopIteration(configCh <-chan kubetypes.PodUpdate, handler SyncHandler,
	syncCh <-chan time.Time, housekeepingCh <-chan time.Time, plegCh <-chan *pleg.PodLifecycleEvent) bool {
	select {
	case u, open := <-configCh:
		// Update from a config source; dispatch it to the right handler
		// callback.
		if !open {
			klog.ErrorS(nil, "Update channel is closed, exiting the sync loop")
			return false
		}

		switch u.Op {
		case kubetypes.ADD:
			klog.V(2).InfoS("SyncLoop ADD", "source", u.Source, "pods", format.Pods(u.Pods))
			// After restarting, kubelet will get all existing pods through
			// ADD as if they are new pods. These pods will then go through the
			// admission process and *may* be rejected. This can be resolved
			// once we have checkpointing.
			handler.HandlePodAdditions(u.Pods)
		case kubetypes.UPDATE:
			klog.V(2).InfoS("SyncLoop UPDATE", "source", u.Source, "pods", format.Pods(u.Pods))
			handler.HandlePodUpdates(u.Pods)
		case kubetypes.REMOVE:
			klog.V(2).InfoS("SyncLoop REMOVE", "source", u.Source, "pods", format.Pods(u.Pods))
			handler.HandlePodRemoves(u.Pods)
		case kubetypes.RECONCILE:
			klog.V(4).InfoS("SyncLoop RECONCILE", "source", u.Source, "pods", format.Pods(u.Pods))
			handler.HandlePodReconcile(u.Pods)
		case kubetypes.DELETE:
			klog.V(2).InfoS("SyncLoop DELETE", "source", u.Source, "pods", format.Pods(u.Pods))
			// DELETE is treated as a UPDATE because of graceful deletion.
			handler.HandlePodUpdates(u.Pods)
		case kubetypes.SET:
			// TODO: Do we want to support this?
			klog.ErrorS(nil, "Kubelet does not support snapshot update")
		default:
			klog.ErrorS(nil, "Invalid operation type received", "operation", u.Op)
		}

HandlePodAdditions

Pod creation is invoked in the order HandlePodAdditions → dispatchWork → podWorkers.UpdatePod.

func (kl *Kubelet) HandlePodAdditions(pods []*v1.Pod) {
	start := kl.clock.Now()
	sort.Sort(sliceutils.PodsByCreationTime(pods))
	for _, pod := range pods {
		existingPods := kl.podManager.GetPods()
		kl.podManager.AddPod(pod)

		if kubetypes.IsMirrorPod(pod) {
			kl.handleMirrorPod(pod, start)
			continue
		}

		if !kl.podWorkers.IsPodTerminationRequested(pod.UID) {
			// We failed pods that we rejected, so activePods include all admitted
			// pods that are alive.
			activePods := kl.filterOutInactivePods(existingPods)

			// Check if we can admit the pod; if not, reject it.
			if ok, reason, message := kl.canAdmitPod(activePods, pod); !ok {
				kl.rejectPod(pod, reason, message)
				continue
			}
		}
		mirrorPod, _ := kl.podManager.GetMirrorPodByPod(pod)
		// Run the pod startup process asynchronously.
		kl.dispatchWork(pod, kubetypes.SyncPodCreate, mirrorPod, start)
		// Start the Startup/Liveness/Readiness probe workers for each container the pod runs.
		kl.probeManager.AddPod(pod)
	}
}

func (kl *Kubelet) dispatchWork(pod *v1.Pod, syncType kubetypes.SyncPodType, mirrorPod *v1.Pod, start time.Time) {
	// Run the sync in an async worker.
	kl.podWorkers.UpdatePod(UpdatePodOptions{
		Pod:        pod,
		MirrorPod:  mirrorPod,
		UpdateType: syncType,
		StartTime:  start,
	})
	// Note the number of containers for new pods.
	if syncType == kubetypes.SyncPodCreate {
		metrics.ContainersPerPodCount.Observe(float64(len(pod.Spec.Containers)))
	}
}

UpdatePod

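Processing then moves through UpdatePod → managePodLoop (goroutine) → syncPodFn, and syncPodFn is what actually carries out container startup.
syncPodFn apparently syncs the pod status with the API server and creates the pod's directories, among other things.
After that, it ends up calling kubeGenericRuntimeManager.SyncPod.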
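
The per-pod worker pattern described here (one goroutine per pod UID, fed through a channel with a buffer of 1) can be sketched in isolation. This is a simplified illustration, not the kubelet's code: `workerPool`, `Processed`, and `Close` are hypothetical names, and the real podWorkers also tracks in-flight work and undelivered updates, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"sync"
)

// podWork is a simplified stand-in for the kubelet's work item.
type podWork struct {
	uid      string
	workType string
}

// workerPool keeps one channel and one goroutine per pod UID.
type workerPool struct {
	mu        sync.Mutex
	updates   map[string]chan podWork
	processed map[string]int
	wg        sync.WaitGroup
}

func newWorkerPool() *workerPool {
	return &workerPool{
		updates:   make(map[string]chan podWork),
		processed: make(map[string]int),
	}
}

// UpdatePod lazily starts a worker goroutine for the pod, then hands it the work.
// Unlike the kubelet, this sketch simply blocks when the buffer is full.
func (p *workerPool) UpdatePod(work podWork) {
	p.mu.Lock()
	ch, exists := p.updates[work.uid]
	if !exists {
		ch = make(chan podWork, 1)
		p.updates[work.uid] = ch
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			p.managePodLoop(ch)
		}()
	}
	p.mu.Unlock()
	ch <- work
}

// managePodLoop handles one pod's work items in order until its channel closes.
func (p *workerPool) managePodLoop(ch <-chan podWork) {
	for work := range ch {
		p.mu.Lock()
		p.processed[work.uid]++ // stands in for the real sync function
		p.mu.Unlock()
	}
}

// Close stops all workers. Call it only after every UpdatePod call has returned.
func (p *workerPool) Close() {
	p.mu.Lock()
	for _, ch := range p.updates {
		close(ch)
	}
	p.mu.Unlock()
	p.wg.Wait()
}

// Processed returns how many work items were handled for the given pod.
func (p *workerPool) Processed(uid string) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.processed[uid]
}

func main() {
	pool := newWorkerPool()
	pool.UpdatePod(podWork{uid: "pod-a", workType: "create"})
	pool.UpdatePod(podWork{uid: "pod-a", workType: "update"})
	pool.UpdatePod(podWork{uid: "pod-b", workType: "create"})
	pool.Close()
	fmt.Println(pool.Processed("pod-a"), pool.Processed("pod-b"))
}
```

Keeping one goroutine per pod means updates for the same pod are handled in order, while different pods proceed independently.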

func (p *podWorkers) UpdatePod(options UpdatePodOptions) {
...
	if podUpdates, exists = p.podUpdates[uid]; !exists {
		// We need to have a buffer here, because checkForUpdates() method that
		// puts an update into channel is called from the same goroutine where
		// the channel is consumed. However, it is guaranteed that in such case
		// the channel is empty, so buffer of size 1 is enough.
		podUpdates = make(chan podWork, 1)
		p.podUpdates[uid] = podUpdates

		// Creating a new pod worker either means this is a new pod, or that the
		// kubelet just restarted. In either case the kubelet is willing to believe
		// the status of the pod for the first pod worker sync. See corresponding
		// comment in syncPod.
		go func() {
			defer runtime.HandleCrash()
			p.managePodLoop(podUpdates)
		}()
	}
...

func (p *podWorkers) managePodLoop(podUpdates <-chan podWork) {
	var lastSyncTime time.Time
	for update := range podUpdates {
		pod := update.Options.Pod

		klog.V(4).InfoS("Processing pod event", "pod", klog.KObj(pod), "podUID", pod.UID, "updateType", update.WorkType)
		err := func() error {
			// The worker is responsible for ensuring the sync method sees the appropriate
			// status updates on resyncs (the result of the last sync), transitions to
			// terminating (no wait), or on terminated (whatever the most recent state is).
			// Only syncing and terminating can generate pod status changes, while terminated
			// pods ensure the most recent status makes it to the api server.
			var status *kubecontainer.PodStatus
			var err error
			switch {
			case update.Options.RunningPod != nil:
				// when we receive a running pod, we don't need status at all
			default:
				// wait until we see the next refresh from the PLEG via the cache (max 2s)
				// TODO: this adds ~1s of latency on all transitions from sync to terminating
				//  to terminated, and on all termination retries (including evictions). We should
				//  improve latency by making the the pleg continuous and by allowing pod status
				//  changes to be refreshed when key events happen (killPod, sync->terminating).
				//  Improving this latency also reduces the possibility that a terminated
				//  container's status is garbage collected before we have a chance to update the
				//  API server (thus losing the exit code).
				status, err = p.podCache.GetNewerThan(pod.UID, lastSyncTime)
			}
			if err != nil {
				// This is the legacy event thrown by manage pod loop all other events are now dispatched
				// from syncPodFn
				p.recorder.Eventf(pod, v1.EventTypeWarning, events.FailedSync, "error determining status: %v", err)
				return err
			}

			ctx := p.contextForWorker(pod.UID)

			// Take the appropriate action (illegal phases are prevented by UpdatePod)
			switch {
			case update.WorkType == TerminatedPodWork:
				err = p.syncTerminatedPodFn(ctx, pod, status)

			case update.WorkType == TerminatingPodWork:
				...
			default:
				err = p.syncPodFn(ctx, update.Options.UpdateType, pod, update.Options.MirrorPod, status)
			}

			lastSyncTime = time.Now()
			return err
		}()
...
}

syncPodFn

func (kl *Kubelet) syncPod(ctx context.Context, updateType kubetypes.SyncPodType, pod, mirrorPod *v1.Pod, podStatus *kubecontainer.PodStatus) error {
...
	// Record pod worker start latency if being created
	// TODO: make pod workers record their own latencies
	if updateType == kubetypes.SyncPodCreate {
		if !firstSeenTime.IsZero() {
			// This is the first time we are syncing the pod. Record the latency
			// since kubelet first saw the pod if firstSeenTime is set.
			metrics.PodWorkerStartDuration.Observe(metrics.SinceInSeconds(firstSeenTime))
		} else {
			klog.V(3).InfoS("First seen time not recorded for pod",
				"podUID", pod.UID,
				"pod", klog.KObj(pod))
		}
	}

	// Generate final API pod status with pod and status manager status
	apiPodStatus := kl.generateAPIPodStatus(pod, podStatus)
	// The pod IP may be changed in generateAPIPodStatus if the pod is using host network. (See #24576)
	// TODO(random-liu): After writing pod spec into container labels, check whether pod is using host network, and
	// set pod IP to hostIP directly in runtime.GetPodStatus
	podStatus.IPs = make([]string, 0, len(apiPodStatus.PodIPs))
	for _, ipInfo := range apiPodStatus.PodIPs {
		podStatus.IPs = append(podStatus.IPs, ipInfo.IP)
	}

	if len(podStatus.IPs) == 0 && len(apiPodStatus.PodIP) > 0 {
		podStatus.IPs = []string{apiPodStatus.PodIP}
	}

	// If the pod should not be running, we request the pod's containers be stopped. This is not the same
	// as termination (we want to stop the pod, but potentially restart it later if soft admission allows
	// it later). Set the status and phase appropriately
	runnable := kl.canRunPod(pod)
	if !runnable.Admit {
		// Pod is not runnable; and update the Pod and Container statuses to why.
		if apiPodStatus.Phase != v1.PodFailed && apiPodStatus.Phase != v1.PodSucceeded {
			apiPodStatus.Phase = v1.PodPending
		}
		apiPodStatus.Reason = runnable.Reason
		apiPodStatus.Message = runnable.Message
		// Waiting containers are not creating.
		const waitingReason = "Blocked"
		for _, cs := range apiPodStatus.InitContainerStatuses {
			if cs.State.Waiting != nil {
				cs.State.Waiting.Reason = waitingReason
			}
		}
		for _, cs := range apiPodStatus.ContainerStatuses {
			if cs.State.Waiting != nil {
				cs.State.Waiting.Reason = waitingReason
			}
		}
	}

	// Record the time it takes for the pod to become running.
	existingStatus, ok := kl.statusManager.GetPodStatus(pod.UID)
	if !ok || existingStatus.Phase == v1.PodPending && apiPodStatus.Phase == v1.PodRunning &&
		!firstSeenTime.IsZero() {
		metrics.PodStartDuration.Observe(metrics.SinceInSeconds(firstSeenTime))
	}

	kl.statusManager.SetPodStatus(pod, apiPodStatus)

	// Pods that are not runnable must be stopped - return a typed error to the pod worker
	if !runnable.Admit {
		klog.V(2).InfoS("Pod is not runnable and must have running containers stopped", "pod", klog.KObj(pod), "podUID", pod.UID, "message", runnable.Message)
		var syncErr error
		p := kubecontainer.ConvertPodStatusToRunningPod(kl.getRuntime().Type(), podStatus)
		if err := kl.killPod(pod, p, nil); err != nil {
			kl.recorder.Eventf(pod, v1.EventTypeWarning, events.FailedToKillPod, "error killing pod: %v", err)
			syncErr = fmt.Errorf("error killing pod: %v", err)
			utilruntime.HandleError(syncErr)
		} else {
			// There was no error killing the pod, but the pod cannot be run.
			// Return an error to signal that the sync loop should back off.
			syncErr = fmt.Errorf("pod cannot be run: %s", runnable.Message)
		}
		return syncErr
	}

	// If the network plugin is not ready, only start the pod if it uses the host network
	if err := kl.runtimeState.networkErrors(); err != nil && !kubecontainer.IsHostNetworkPod(pod) {
		kl.recorder.Eventf(pod, v1.EventTypeWarning, events.NetworkNotReady, "%s: %v", NetworkNotReadyErrorMsg, err)
		return fmt.Errorf("%s: %v", NetworkNotReadyErrorMsg, err)
	}

	...

	// Make data directories for the pod
	if err := kl.makePodDataDirs(pod); err != nil {
		...
	}

	// Volume manager will not mount volumes for terminating pods
	// TODO: once context cancellation is added this check can be removed
	if !kl.podWorkers.IsPodTerminationRequested(pod.UID) {
		...
	}

	// Fetch the pull secrets for the pod
	pullSecrets := kl.getPullSecretsForPod(pod)

	// Call the container runtime's SyncPod callback
	result := kl.containerRuntime.SyncPod(pod, podStatus, pullSecrets, kl.backOff)
	kl.reasonCache.Update(pod.UID, result)
	if err := result.Error(); err != nil {
		// Do not return error if the only failures were pods in backoff
		for _, r := range result.SyncResults {
			if r.Error != kubecontainer.ErrCrashLoopBackOff && r.Error != images.ErrImagePullBackOff {
				// Do not record an event here, as we keep all event logging for sync pod failures
				// local to container runtime so we get better errors
				return err
			}
		}

		return nil
	}

	return nil
}

kubeGenericRuntimeManager.SyncPod

Here we finally reach the code we are looking for.
4. Create sandbox if necessary.
6. Create init containers.
7. Create normal containers.
These steps appear to be where the pod sandbox is created and the application containers are started.

// SyncPod syncs the running pod into the desired pod by executing following steps:
//
//  1. Compute sandbox and container changes.
//  2. Kill pod sandbox if necessary.
//  3. Kill any containers that should not be running.
//  4. Create sandbox if necessary.
//  5. Create ephemeral containers.
//  6. Create init containers.
//  7. Create normal containers.
func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {
	// Step 1: Compute sandbox and container changes.
	...

	// Step 2: Kill the pod if the sandbox has changed.
	...
		// Step 3: kill any running containers in this pod which are not to keep.
		...

	// Step 4: Create a sandbox for the pod if necessary.
	podSandboxID := podContainerChanges.SandboxID
	if podContainerChanges.CreateSandbox {
		var msg string
		var err error

		klog.V(4).InfoS("Creating PodSandbox for pod", "pod", klog.KObj(pod))
		metrics.StartedPodsTotal.Inc()
		createSandboxResult := kubecontainer.NewSyncResult(kubecontainer.CreatePodSandbox, format.Pod(pod))
		result.AddSyncResult(createSandboxResult)
		podSandboxID, msg, err = m.createPodSandbox(pod, podContainerChanges.Attempt)
		if err != nil {
			// createPodSandbox can return an error from CNI, CSI,
			// or CRI if the Pod has been deleted while the POD is
			// being created. If the pod has been deleted then it's
			// not a real error.
			//
			// SyncPod can still be running when we get here, which
			// means the PodWorker has not acked the deletion.
			if m.podStateProvider.IsPodTerminationRequested(pod.UID) {
				klog.V(4).InfoS("Pod was deleted and sandbox failed to be created", "pod", klog.KObj(pod), "podUID", pod.UID)
				return
			}
			metrics.StartedPodsErrorsTotal.WithLabelValues(err.Error()).Inc()
			createSandboxResult.Fail(kubecontainer.ErrCreatePodSandbox, msg)
			klog.ErrorS(err, "CreatePodSandbox for pod failed", "pod", klog.KObj(pod))
			ref, referr := ref.GetReference(legacyscheme.Scheme, pod)
			if referr != nil {
				klog.ErrorS(referr, "Couldn't make a ref to pod", "pod", klog.KObj(pod))
			}
			m.recorder.Eventf(ref, v1.EventTypeWarning, events.FailedCreatePodSandBox, "Failed to create pod sandbox: %v", err)
			return
		}
		klog.V(4).InfoS("Created PodSandbox for pod", "podSandboxID", podSandboxID, "pod", klog.KObj(pod))

		podSandboxStatus, err := m.runtimeService.PodSandboxStatus(podSandboxID)
		if err != nil {
			ref, referr := ref.GetReference(legacyscheme.Scheme, pod)
			if referr != nil {
				klog.ErrorS(referr, "Couldn't make a ref to pod", "pod", klog.KObj(pod))
			}
			m.recorder.Eventf(ref, v1.EventTypeWarning, events.FailedStatusPodSandBox, "Unable to get pod sandbox status: %v", err)
			klog.ErrorS(err, "Failed to get pod sandbox status; Skipping pod", "pod", klog.KObj(pod))
			result.Fail(err)
			return
		}

		// If we ever allow updating a pod from non-host-network to
		// host-network, we may use a stale IP.
		if !kubecontainer.IsHostNetworkPod(pod) {
			// Overwrite the podIPs passed in the pod status, since we just started the pod sandbox.
			podIPs = m.determinePodSandboxIPs(pod.Namespace, pod.Name, podSandboxStatus)
			klog.V(4).InfoS("Determined the ip for pod after sandbox changed", "IPs", podIPs, "pod", klog.KObj(pod))
		}
	}

	// the start containers routines depend on pod ip(as in primary pod ip)
	// instead of trying to figure out if we have 0 < len(podIPs)
	// everytime, we short circuit it here
	podIP := ""
	if len(podIPs) != 0 {
		podIP = podIPs[0]
	}

	// Get podSandboxConfig for containers to start.
	configPodSandboxResult := kubecontainer.NewSyncResult(kubecontainer.ConfigPodSandbox, podSandboxID)
	result.AddSyncResult(configPodSandboxResult)
	podSandboxConfig, err := m.generatePodSandboxConfig(pod, podContainerChanges.Attempt)
	if err != nil {
		message := fmt.Sprintf("GeneratePodSandboxConfig for pod %q failed: %v", format.Pod(pod), err)
		klog.ErrorS(err, "GeneratePodSandboxConfig for pod failed", "pod", klog.KObj(pod))
		configPodSandboxResult.Fail(kubecontainer.ErrConfigPodSandbox, message)
		return
	}

	// Helper containing boilerplate common to starting all types of containers.
	// typeName is a description used to describe this type of container in log messages,
	// currently: "container", "init container" or "ephemeral container"
	// metricLabel is the label used to describe this type of container in monitoring metrics.
	// currently: "container", "init_container" or "ephemeral_container"
	start := func(typeName, metricLabel string, spec *startSpec) error {
		startContainerResult := kubecontainer.NewSyncResult(kubecontainer.StartContainer, spec.container.Name)
		result.AddSyncResult(startContainerResult)

		isInBackOff, msg, err := m.doBackOff(pod, spec.container, podStatus, backOff)
		if isInBackOff {
			startContainerResult.Fail(err, msg)
			klog.V(4).InfoS("Backing Off restarting container in pod", "containerType", typeName, "container", spec.container, "pod", klog.KObj(pod))
			return err
		}

		metrics.StartedContainersTotal.WithLabelValues(metricLabel).Inc()
		klog.V(4).InfoS("Creating container in pod", "containerType", typeName, "container", spec.container, "pod", klog.KObj(pod))
		// NOTE (aramase) podIPs are populated for single stack and dual stack clusters. Send only podIPs.
		if msg, err := m.startContainer(podSandboxID, podSandboxConfig, spec, pod, podStatus, pullSecrets, podIP, podIPs); err != nil {
			// startContainer() returns well-defined error codes that have reasonable cardinality for metrics and are
			// useful to cluster administrators to distinguish "server errors" from "user errors".
			metrics.StartedContainersErrorsTotal.WithLabelValues(metricLabel, err.Error()).Inc()
			startContainerResult.Fail(err, msg)
			// known errors that are logged in other places are logged at higher levels here to avoid
			// repetitive log spam
			switch {
			case err == images.ErrImagePullBackOff:
				klog.V(3).InfoS("Container start failed in pod", "containerType", typeName, "container", spec.container, "pod", klog.KObj(pod), "containerMessage", msg, "err", err)
			default:
				utilruntime.HandleError(fmt.Errorf("%v %+v start failed in pod %v: %v: %s", typeName, spec.container, format.Pod(pod), err, msg))
			}
			return err
		}

		return nil
	}

	// Step 5: start ephemeral containers
	...

	// Step 6: start the init container.
	if container := podContainerChanges.NextInitContainerToStart; container != nil {
		// Start the next init container.
		if err := start("init container", metrics.InitContainer, containerStartSpec(container)); err != nil {
			return
		}

		// Successfully started the container; clear the entry in the failure
		klog.V(4).InfoS("Completed init container for pod", "containerName", container.Name, "pod", klog.KObj(pod))
	}

	// Step 7: start containers in podContainerChanges.ContainersToStart.
	for _, idx := range podContainerChanges.ContainersToStart {
		start("container", metrics.Container, containerStartSpec(&pod.Spec.Containers[idx]))
	}

	return
}

createPodSandbox

It appears that generatePodSandboxConfig generates the pod's config, and m.runtimeService.RunPodSandbox then invokes the RunPodSandbox CRI call.

// createPodSandbox creates a pod sandbox and returns (podSandBoxID, message, error).
func (m *kubeGenericRuntimeManager) createPodSandbox(pod *v1.Pod, attempt uint32) (string, string, error) {
	podSandboxConfig, err := m.generatePodSandboxConfig(pod, attempt)
	if err != nil {
		message := fmt.Sprintf("Failed to generate sandbox config for pod %q: %v", format.Pod(pod), err)
		klog.ErrorS(err, "Failed to generate sandbox config for pod", "pod", klog.KObj(pod))
		return "", message, err
	}

	// Create pod logs directory
	err = m.osInterface.MkdirAll(podSandboxConfig.LogDirectory, 0755)
	if err != nil {
		message := fmt.Sprintf("Failed to create log directory for pod %q: %v", format.Pod(pod), err)
		klog.ErrorS(err, "Failed to create log directory for pod", "pod", klog.KObj(pod))
		return "", message, err
	}

	runtimeHandler := ""
	if m.runtimeClassManager != nil {
		runtimeHandler, err = m.runtimeClassManager.LookupRuntimeHandler(pod.Spec.RuntimeClassName)
		if err != nil {
			message := fmt.Sprintf("Failed to create sandbox for pod %q: %v", format.Pod(pod), err)
			return "", message, err
		}
		if runtimeHandler != "" {
			klog.V(2).InfoS("Running pod with runtime handler", "pod", klog.KObj(pod), "runtimeHandler", runtimeHandler)
		}
	}

	podSandBoxID, err := m.runtimeService.RunPodSandbox(podSandboxConfig, runtimeHandler)
	if err != nil {
		message := fmt.Sprintf("Failed to create sandbox for pod %q: %v", format.Pod(pod), err)
		klog.ErrorS(err, "Failed to create sandbox for pod", "pod", klog.KObj(pod))
		return "", message, err
	}

	return podSandBoxID, "", nil
}

RunPodSandbox(CRI client)

RunPodSandbox is defined as a gRPC method, and the kubelet side implements the CRI client for it.

func (r *remoteRuntimeService) RunPodSandbox(config *runtimeapi.PodSandboxConfig, runtimeHandler string) (string, error) {
	// Use 2 times longer timeout for sandbox operation (4 mins by default)
	// TODO: Make the pod sandbox timeout configurable.
	timeout := r.timeout * 2

	klog.V(10).InfoS("[RemoteRuntimeService] RunPodSandbox", "config", config, "runtimeHandler", runtimeHandler, "timeout", timeout)

	ctx, cancel := getContextWithTimeout(timeout)
	defer cancel()

	resp, err := r.runtimeClient.RunPodSandbox(ctx, &runtimeapi.RunPodSandboxRequest{
		Config:         config,
		RuntimeHandler: runtimeHandler,
	})
	if err != nil {
		klog.ErrorS(err, "RunPodSandbox from runtime service failed")
		return "", err
	}

	if resp.PodSandboxId == "" {
		errorMessage := fmt.Sprintf("PodSandboxId is not set for sandbox %q", config.GetMetadata())
		err := errors.New(errorMessage)
		klog.ErrorS(err, "RunPodSandbox failed")
		return "", err
	}

	klog.V(10).InfoS("[RemoteRuntimeService] RunPodSandbox Response", "podSandboxID", resp.PodSandboxId)

	return resp.PodSandboxId, nil
}
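
The doubled timeout used for sandbox operations above can be illustrated with the standard context package. This is a standalone sketch: `callWithDoubledTimeout`, `simulate`, and `timedOut` are hypothetical helpers, and a sleep stands in for the gRPC call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callWithDoubledTimeout runs fn with a deadline of twice the base timeout,
// mirroring the "timeout * 2" idea in the excerpt above.
func callWithDoubledTimeout(base time.Duration, fn func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), base*2)
	defer cancel()
	return fn(ctx)
}

// simulate pretends to make a remote call that takes workMs milliseconds,
// under a base timeout of baseMs milliseconds.
func simulate(baseMs, workMs int) error {
	base := time.Duration(baseMs) * time.Millisecond
	work := time.Duration(workMs) * time.Millisecond
	return callWithDoubledTimeout(base, func(ctx context.Context) error {
		select {
		case <-time.After(work):
			return nil // the call finished before the deadline
		case <-ctx.Done():
			return ctx.Err() // the deadline expired first
		}
	})
}

// timedOut reports whether err was caused by the context deadline.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(simulate(50, 5))             // finishes well within the 100 ms deadline
	fmt.Println(timedOut(simulate(10, 500))) // 500 ms of work against a 20 ms deadline
}
```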

service RuntimeService {
    // Version returns the runtime name, runtime version, and runtime API version.
    rpc Version(VersionRequest) returns (VersionResponse) {}

    // RunPodSandbox creates and starts a pod-level sandbox. Runtimes must ensure
    // the sandbox is in the ready state on success.
    rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
    // StopPodSandbox stops any running process that is part of the sandbox and
    // reclaims network resources (e.g., IP addresses) allocated to the sandbox.
    // If there are any running containers in the sandbox, they must be forcibly
    // terminated.
    // This call is idempotent, and must not return an error if all relevant
    // resources have already been reclaimed. kubelet will call StopPodSandbox
    // at least once before calling RemovePodSandbox. It will also attempt to
    // reclaim resources eagerly, as soon as a sandbox is not needed. Hence,
    // multiple StopPodSandbox calls are expected.
    rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}
    // RemovePodSandbox removes the sandbox. If there are any running containers
    // in the sandbox, they must be forcibly terminated and removed.
    // This call is idempotent, and must not return an error if the sandbox has
    // already been removed.
    rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {}
    // PodSandboxStatus returns the status of the PodSandbox. If the PodSandbox is not
    // present, returns an error.
    rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {}
    // ListPodSandbox returns a list of PodSandboxes.
    rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {}

    // CreateContainer creates a new container in specified PodSandbox
    rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
    // StartContainer starts the container.
    rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
    // StopContainer stops a running container with a grace period (i.e., timeout).
    // This call is idempotent, and must not return an error if the container has
    // already been stopped.
    // TODO: what must the runtime do after the grace period is reached?
    rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}
    // RemoveContainer removes the container. If the container is running, the
    // container must be forcibly removed.
    // This call is idempotent, and must not return an error if the container has
    // already been removed.
    rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}
    // ListContainers lists all containers by filters.
    rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {}
    // ContainerStatus returns status of the container. If the container is not
    // present, returns an error.
    rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}
    // UpdateContainerResources updates ContainerConfig of the container.
    rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {}
    // ReopenContainerLog asks runtime to reopen the stdout/stderr log file
    // for the container. This is often called after the log file has been
    // rotated. If the container is not running, container runtime can choose
    // to either create a new log file and return nil, or return an error.
    // Once it returns error, new container log file MUST NOT be created.
    rpc ReopenContainerLog(ReopenContainerLogRequest) returns (ReopenContainerLogResponse) {}

    // ExecSync runs a command in a container synchronously.
    rpc ExecSync(ExecSyncRequest) returns (ExecSyncResponse) {}
    // Exec prepares a streaming endpoint to execute a command in the container.
    rpc Exec(ExecRequest) returns (ExecResponse) {}
    // Attach prepares a streaming endpoint to attach to a running container.
    rpc Attach(AttachRequest) returns (AttachResponse) {}
    // PortForward prepares a streaming endpoint to forward ports from a PodSandbox.
    rpc PortForward(PortForwardRequest) returns (PortForwardResponse) {}

    // ContainerStats returns stats of the container. If the container does not
    // exist, the call returns an error.
    rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
    // ListContainerStats returns stats of all running containers.
    rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}

    // PodSandboxStats returns stats of the pod sandbox. If the pod sandbox does not
    // exist, the call returns an error.
    rpc PodSandboxStats(PodSandboxStatsRequest) returns (PodSandboxStatsResponse) {}
    // ListPodSandboxStats returns stats of the pod sandboxes matching a filter.
    rpc ListPodSandboxStats(ListPodSandboxStatsRequest) returns (ListPodSandboxStatsResponse) {}

    // UpdateRuntimeConfig updates the runtime configuration based on the given request.
    rpc UpdateRuntimeConfig(UpdateRuntimeConfigRequest) returns (UpdateRuntimeConfigResponse) {}

    // Status returns the status of the runtime.
    rpc Status(StatusRequest) returns (StatusResponse) {}
}
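The comments above require the stop and remove calls to be idempotent, while the status calls return an error for a missing object. The following is a minimal sketch of that contract, assuming a hypothetical in-memory `sandboxStore` type that is not part of the CRI or of any real runtime:

```go
package main

import (
	"fmt"
	"sync"
)

// sandboxStore is a hypothetical in-memory stand-in for a runtime's
// sandbox bookkeeping. It is not part of the CRI API.
type sandboxStore struct {
	mu        sync.Mutex
	sandboxes map[string]bool // sandbox ID -> running?
}

func newSandboxStore() *sandboxStore {
	return &sandboxStore{sandboxes: make(map[string]bool)}
}

// Run registers a sandbox as running.
func (s *sandboxStore) Run(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sandboxes[id] = true
}

// Stop marks the sandbox as stopped. Stopping an unknown or already
// stopped sandbox is not an error, so repeated calls are safe.
func (s *sandboxStore) Stop(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.sandboxes[id]; ok {
		s.sandboxes[id] = false
	}
	return nil
}

// Remove deletes the sandbox. Removing a missing sandbox also succeeds.
func (s *sandboxStore) Remove(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.sandboxes, id)
	return nil
}

// Status reports whether the sandbox is running and returns an error
// when the sandbox is not present.
func (s *sandboxStore) Status(id string) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	running, ok := s.sandboxes[id]
	if !ok {
		return false, fmt.Errorf("sandbox %q not found", id)
	}
	return running, nil
}

func main() {
	s := newSandboxStore()
	s.Run("sb-1")
	fmt.Println(s.Stop("sb-1"), s.Stop("sb-1"))     // repeated stop is fine
	fmt.Println(s.Remove("sb-1"), s.Remove("sb-1")) // repeated remove is fine
	_, err := s.Status("sb-1")
	fmt.Println(err)
}
```

Because kubelet is expected to call StopPodSandbox more than once, treating "already stopped" as success keeps those retries harmless.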

func (c *runtimeServiceClient) RunPodSandbox(ctx context.Context, in *RunPodSandboxRequest, opts ...grpc.CallOption) (*RunPodSandboxResponse, error) {
	out := new(RunPodSandboxResponse)
	err := c.cc.Invoke(ctx, "/runtime.v1alpha2.RuntimeService/RunPodSandbox", in, out, opts...)
	if err != nil {
		return nil, err
	}
	return out, nil
}

RunPodSandbox(docker)

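In Step 2, ds.client.CreateContainer creates the pause container, and in Step 4, ds.client.StartContainer starts it.
From there, the container should be created in the same way as when the docker CLI sends a request to dockerd.
In addition, resolv.conf is overwritten and the network is set up through CNI.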

kubernetes/pkg/kubelet/dockershim/docker_sandbox.go

// RunPodSandbox creates and starts a pod-level sandbox. Runtimes should ensure
// the sandbox is in ready state.
// For docker, PodSandbox is implemented by a container holding the network
// namespace for the pod.
// Note: docker doesn't use LogDirectory (yet).
func (ds *dockerService) RunPodSandbox(ctx context.Context, r *runtimeapi.RunPodSandboxRequest) (*runtimeapi.RunPodSandboxResponse, error) {
	config := r.GetConfig()

	// Step 1: Pull the image for the sandbox.
	image := defaultSandboxImage
	podSandboxImage := ds.podSandboxImage
	if len(podSandboxImage) != 0 {
		image = podSandboxImage
	}

	// NOTE: To use a custom sandbox image in a private repository, users need to configure the nodes with credentials properly.
	// see: https://kubernetes.io/docs/user-guide/images/#configuring-nodes-to-authenticate-to-a-private-registry
	// Only pull sandbox image when it's not present - v1.PullIfNotPresent.
	if err := ensureSandboxImageExists(ds.client, image); err != nil {
		return nil, err
	}

	// Step 2: Create the sandbox container.
	if r.GetRuntimeHandler() != "" && r.GetRuntimeHandler() != runtimeName {
		return nil, fmt.Errorf("RuntimeHandler %q not supported", r.GetRuntimeHandler())
	}
	createConfig, err := ds.makeSandboxDockerConfig(config, image)
	if err != nil {
		return nil, fmt.Errorf("failed to make sandbox docker config for pod %q: %v", config.Metadata.Name, err)
	}
	createResp, err := ds.client.CreateContainer(*createConfig)
	if err != nil {
		createResp, err = recoverFromCreationConflictIfNeeded(ds.client, *createConfig, err)
	}

	if err != nil || createResp == nil {
		return nil, fmt.Errorf("failed to create a sandbox for pod %q: %v", config.Metadata.Name, err)
	}
	resp := &runtimeapi.RunPodSandboxResponse{PodSandboxId: createResp.ID}

	ds.setNetworkReady(createResp.ID, false)
	defer func(e *error) {
		// Set networking ready depending on the error return of
		// the parent function
		if *e == nil {
			ds.setNetworkReady(createResp.ID, true)
		}
	}(&err)

	// Step 3: Create Sandbox Checkpoint.
	if err = ds.checkpointManager.CreateCheckpoint(createResp.ID, constructPodSandboxCheckpoint(config)); err != nil {
		return nil, err
	}

	// Step 4: Start the sandbox container.
	// Assume kubelet's garbage collector would remove the sandbox later, if
	// startContainer failed.
	err = ds.client.StartContainer(createResp.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to start sandbox container for pod %q: %v", config.Metadata.Name, err)
	}

	// Rewrite resolv.conf file generated by docker.
	// NOTE: cluster dns settings aren't passed anymore to docker api in all cases,
	// not only for pods with host network: the resolver conf will be overwritten
	// after sandbox creation to override docker's behaviour. This resolv.conf
	// file is shared by all containers of the same pod, and needs to be modified
	// only once per pod.
	if dnsConfig := config.GetDnsConfig(); dnsConfig != nil {
		containerInfo, err := ds.client.InspectContainer(createResp.ID)
		if err != nil {
			return nil, fmt.Errorf("failed to inspect sandbox container for pod %q: %v", config.Metadata.Name, err)
		}

		if err := rewriteResolvFile(containerInfo.ResolvConfPath, dnsConfig.Servers, dnsConfig.Searches, dnsConfig.Options); err != nil {
			return nil, fmt.Errorf("rewrite resolv.conf failed for pod %q: %v", config.Metadata.Name, err)
		}
	}

	// Do not invoke network plugins if in hostNetwork mode.
	if config.GetLinux().GetSecurityContext().GetNamespaceOptions().GetNetwork() == runtimeapi.NamespaceMode_NODE {
		return resp, nil
	}

	// Step 5: Setup networking for the sandbox.
	// All pod networking is setup by a CNI plugin discovered at startup time.
	// This plugin assigns the pod ip, sets up routes inside the sandbox,
	// creates interfaces etc. In theory, its jurisdiction ends with pod
	// sandbox networking, but it might insert iptables rules or open ports
	// on the host as well, to satisfy parts of the pod spec that aren't
	// recognized by the CNI standard yet.
	cID := kubecontainer.BuildContainerID(runtimeName, createResp.ID)
	networkOptions := make(map[string]string)
	if dnsConfig := config.GetDnsConfig(); dnsConfig != nil {
		// Build DNS options.
		dnsOption, err := json.Marshal(dnsConfig)
		if err != nil {
			return nil, fmt.Errorf("failed to marshal dns config for pod %q: %v", config.Metadata.Name, err)
		}
		networkOptions["dns"] = string(dnsOption)
	}
	err = ds.network.SetUpPod(config.GetMetadata().Namespace, config.GetMetadata().Name, cID, config.Annotations, networkOptions)
	if err != nil {
		errList := []error{fmt.Errorf("failed to set up sandbox container %q network for pod %q: %v", createResp.ID, config.Metadata.Name, err)}

		// Ensure network resources are cleaned up even if the plugin
		// succeeded but an error happened between that success and here.
		err = ds.network.TearDownPod(config.GetMetadata().Namespace, config.GetMetadata().Name, cID)
		if err != nil {
			errList = append(errList, fmt.Errorf("failed to clean up sandbox container %q network for pod %q: %v", createResp.ID, config.Metadata.Name, err))
		}

		err = ds.client.StopContainer(createResp.ID, defaultSandboxGracePeriod)
		if err != nil {
			errList = append(errList, fmt.Errorf("failed to stop sandbox container %q for pod %q: %v", createResp.ID, config.Metadata.Name, err))
		}

		return resp, utilerrors.NewAggregate(errList)
	}

	return resp, nil
}
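The resolv.conf rewrite above receives the DNS servers, search domains, and options from the pod's DNS config. The body of rewriteResolvFile is not shown here, so the following is only a sketch of the kind of content such a step would produce, using the standard resolv.conf keywords. `buildResolvConf` is a hypothetical helper, and the real function may differ in its details:

```go
package main

import (
	"fmt"
	"strings"
)

// buildResolvConf is a hypothetical helper that renders resolv.conf content
// from the three lists passed to the rewrite step. It is an illustration,
// not the dockershim implementation.
func buildResolvConf(servers, searches, options []string) string {
	var b strings.Builder
	// One "nameserver" line per DNS server.
	for _, s := range servers {
		fmt.Fprintf(&b, "nameserver %s\n", s)
	}
	// A single "search" line listing all search domains.
	if len(searches) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(searches, " "))
	}
	// A single "options" line listing all resolver options.
	if len(options) > 0 {
		fmt.Fprintf(&b, "options %s\n", strings.Join(options, " "))
	}
	return b.String()
}

func main() {
	// The values below are made-up examples.
	fmt.Print(buildResolvConf(
		[]string{"10.96.0.10"},
		[]string{"default.svc.cluster.local", "svc.cluster.local"},
		[]string{"ndots:5"},
	))
}
```

Since every container in the pod shares the sandbox's resolv.conf, writing this content once per pod is enough.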

RunPodSandbox(containerd)

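Looking at the server-side processing in containerd as well, we can see what appears to be the same kind of container creation and network setup found in dockershim.
Because containerd manages a running container as a task, container creation here presumably corresponds to container.NewTask and task.Start.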

// RunPodSandbox creates and starts a pod-level sandbox. Runtimes should ensure
// the sandbox is in ready state.
func (c *criService) RunPodSandbox(ctx context.Context, r *runtime.RunPodSandboxRequest) (_ *runtime.RunPodSandboxResponse, retErr error) {
	config := r.GetConfig()
	log.G(ctx).Debugf("Sandbox config %+v", config)

	// Generate unique id and name for the sandbox and reserve the name.
	id := util.GenerateID()
	metadata := config.GetMetadata()
	if metadata == nil {
		return nil, errors.New("sandbox config must include metadata")
	}
	name := makeSandboxName(metadata)
	log.G(ctx).Debugf("Generated id %q for sandbox %q", id, name)
	// Reserve the sandbox name to avoid concurrent `RunPodSandbox` request starting the
	// same sandbox.
	if err := c.sandboxNameIndex.Reserve(name, id); err != nil {
		return nil, errors.Wrapf(err, "failed to reserve sandbox name %q", name)
	}
	defer func() {
		// Release the name if the function returns with an error.
		if retErr != nil {
			c.sandboxNameIndex.ReleaseByName(name)
		}
	}()

	// Create initial internal sandbox object.
	sandbox := sandboxstore.NewSandbox(
		sandboxstore.Metadata{
			ID:             id,
			Name:           name,
			Config:         config,
			RuntimeHandler: r.GetRuntimeHandler(),
		},
		sandboxstore.Status{
			State: sandboxstore.StateUnknown,
		},
	)

	// Ensure sandbox container image snapshot.
	image, err := c.ensureImageExists(ctx, c.config.SandboxImage, config)
	if err != nil {
		return nil, errors.Wrapf(err, "failed to get sandbox image %q", c.config.SandboxImage)
	}
	containerdImage, err := c.toContainerdImage(ctx, *image)
	if err != nil {
		return nil, errors.Wrapf(err, "failed to get image from containerd %q", image.ID)
	}

	ociRuntime, err := c.getSandboxRuntime(config, r.GetRuntimeHandler())
	if err != nil {
		return nil, errors.Wrap(err, "failed to get sandbox runtime")
	}
	log.G(ctx).Debugf("Use OCI %+v for sandbox %q", ociRuntime, id)

	podNetwork := true

	if goruntime.GOOS != "windows" &&
		config.GetLinux().GetSecurityContext().GetNamespaceOptions().GetNetwork() == runtime.NamespaceMode_NODE {
		// Pod network is not needed on linux with host network.
		podNetwork = false
	}
	if goruntime.GOOS == "windows" &&
		config.GetWindows().GetSecurityContext().GetHostProcess() {
		//Windows HostProcess pods can only run on the host network
		podNetwork = false
	}

	if podNetwork {
		// If it is not in host network namespace then create a namespace and set the sandbox
		// handle. NetNSPath in sandbox metadata and NetNS is non empty only for non host network
		// namespaces. If the pod is in host network namespace then both are empty and should not
		// be used.
		var netnsMountDir = "/var/run/netns"
		if c.config.NetNSMountsUnderStateDir {
			netnsMountDir = filepath.Join(c.config.StateDir, "netns")
		}
		sandbox.NetNS, err = netns.NewNetNS(netnsMountDir)
		if err != nil {
			return nil, errors.Wrapf(err, "failed to create network namespace for sandbox %q", id)
		}
		sandbox.NetNSPath = sandbox.NetNS.GetPath()
		defer func() {
			if retErr != nil {
				deferCtx, deferCancel := ctrdutil.DeferContext()
				defer deferCancel()
				// Teardown network if an error is returned.
				if err := c.teardownPodNetwork(deferCtx, sandbox); err != nil {
					log.G(ctx).WithError(err).Errorf("Failed to destroy network for sandbox %q", id)
				}

				if err := sandbox.NetNS.Remove(); err != nil {
					log.G(ctx).WithError(err).Errorf("Failed to remove network namespace %s for sandbox %q", sandbox.NetNSPath, id)
				}
				sandbox.NetNSPath = ""
			}
		}()

		// Setup network for sandbox.
		// Certain VM based solutions like clear containers (Issue containerd/cri-containerd#524)
		// rely on the assumption that CRI shim will not be querying the network namespace to check the
		// network states such as IP.
		// In future runtime implementation should avoid relying on CRI shim implementation details.
		// In this case however caching the IP will add a subtle performance enhancement by avoiding
		// calls to network namespace of the pod to query the IP of the veth interface on every
		// SandboxStatus request.
		if err := c.setupPodNetwork(ctx, &sandbox); err != nil {
			return nil, errors.Wrapf(err, "failed to setup network for sandbox %q", id)
		}
	}

	// Create sandbox container.
	// NOTE: sandboxContainerSpec SHOULD NOT have side
	// effect, e.g. accessing/creating files, so that we can test
	// it safely.
	spec, err := c.sandboxContainerSpec(id, config, &image.ImageSpec.Config, sandbox.NetNSPath, ociRuntime.PodAnnotations)
	if err != nil {
		return nil, errors.Wrap(err, "failed to generate sandbox container spec")
	}
	log.G(ctx).Debugf("Sandbox container %q spec: %#+v", id, spew.NewFormatter(spec))
	sandbox.ProcessLabel = spec.Process.SelinuxLabel
	defer func() {
		if retErr != nil {
			selinux.ReleaseLabel(sandbox.ProcessLabel)
		}
	}()

	// handle any KVM based runtime
	if err := modifyProcessLabel(ociRuntime.Type, spec); err != nil {
		return nil, err
	}

	if config.GetLinux().GetSecurityContext().GetPrivileged() {
		// If privileged don't set selinux label, but we still record the MCS label so that
		// the unused label can be freed later.
		spec.Process.SelinuxLabel = ""
	}

	// Generate spec options that will be applied to the spec later.
	specOpts, err := c.sandboxContainerSpecOpts(config, &image.ImageSpec.Config)
	if err != nil {
		return nil, errors.Wrap(err, "failed to generate sanbdox container spec options")
	}

	sandboxLabels := buildLabels(config.Labels, image.ImageSpec.Config.Labels, containerKindSandbox)

	runtimeOpts, err := generateRuntimeOptions(ociRuntime, c.config)
	if err != nil {
		return nil, errors.Wrap(err, "failed to generate runtime options")
	}

	snapshotterOpt := snapshots.WithLabels(snapshots.FilterInheritedLabels(config.Annotations))
	opts := []containerd.NewContainerOpts{
		containerd.WithSnapshotter(c.config.ContainerdConfig.Snapshotter),
		customopts.WithNewSnapshot(id, containerdImage, snapshotterOpt),
		containerd.WithSpec(spec, specOpts...),
		containerd.WithContainerLabels(sandboxLabels),
		containerd.WithContainerExtension(sandboxMetadataExtension, &sandbox.Metadata),
		containerd.WithRuntime(ociRuntime.Type, runtimeOpts)}

	container, err := c.client.NewContainer(ctx, id, opts...)
	if err != nil {
		return nil, errors.Wrap(err, "failed to create containerd container")
	}
	defer func() {
		if retErr != nil {
			deferCtx, deferCancel := ctrdutil.DeferContext()
			defer deferCancel()
			if err := container.Delete(deferCtx, containerd.WithSnapshotCleanup); err != nil {
				log.G(ctx).WithError(err).Errorf("Failed to delete containerd container %q", id)
			}
		}
	}()

	// Create sandbox container root directories.
	sandboxRootDir := c.getSandboxRootDir(id)
	if err := c.os.MkdirAll(sandboxRootDir, 0755); err != nil {
		return nil, errors.Wrapf(err, "failed to create sandbox root directory %q",
			sandboxRootDir)
	}
	defer func() {
		if retErr != nil {
			// Cleanup the sandbox root directory.
			if err := c.os.RemoveAll(sandboxRootDir); err != nil {
				log.G(ctx).WithError(err).Errorf("Failed to remove sandbox root directory %q",
					sandboxRootDir)
			}
		}
	}()
	volatileSandboxRootDir := c.getVolatileSandboxRootDir(id)
	if err := c.os.MkdirAll(volatileSandboxRootDir, 0755); err != nil {
		return nil, errors.Wrapf(err, "failed to create volatile sandbox root directory %q",
			volatileSandboxRootDir)
	}
	defer func() {
		if retErr != nil {
			// Cleanup the volatile sandbox root directory.
			if err := c.os.RemoveAll(volatileSandboxRootDir); err != nil {
				log.G(ctx).WithError(err).Errorf("Failed to remove volatile sandbox root directory %q",
					volatileSandboxRootDir)
			}
		}
	}()

	// Setup files required for the sandbox.
	if err = c.setupSandboxFiles(id, config); err != nil {
		return nil, errors.Wrapf(err, "failed to setup sandbox files")
	}
	defer func() {
		if retErr != nil {
			if err = c.cleanupSandboxFiles(id, config); err != nil {
				log.G(ctx).WithError(err).Errorf("Failed to cleanup sandbox files in %q",
					sandboxRootDir)
			}
		}
	}()

	// Update sandbox created timestamp.
	info, err := container.Info(ctx)
	if err != nil {
		return nil, errors.Wrap(err, "failed to get sandbox container info")
	}

	// Create sandbox task in containerd.
	log.G(ctx).Tracef("Create sandbox container (id=%q, name=%q).",
		id, name)

	taskOpts := c.taskOpts(ociRuntime.Type)
	// We don't need stdio for sandbox container.
	task, err := container.NewTask(ctx, containerdio.NullIO, taskOpts...)
	if err != nil {
		return nil, errors.Wrap(err, "failed to create containerd task")
	}
	defer func() {
		if retErr != nil {
			deferCtx, deferCancel := ctrdutil.DeferContext()
			defer deferCancel()
			// Cleanup the sandbox container if an error is returned.
			if _, err := task.Delete(deferCtx, WithNRISandboxDelete(id), containerd.WithProcessKill); err != nil && !errdefs.IsNotFound(err) {
				log.G(ctx).WithError(err).Errorf("Failed to delete sandbox container %q", id)
			}
		}
	}()

	// wait is a long running background request, no timeout needed.
	exitCh, err := task.Wait(ctrdutil.NamespacedContext())
	if err != nil {
		return nil, errors.Wrap(err, "failed to wait for sandbox container task")
	}

	nric, err := nri.New()
	if err != nil {
		return nil, errors.Wrap(err, "unable to create nri client")
	}
	if nric != nil {
		nriSB := &nri.Sandbox{
			ID:     id,
			Labels: config.Labels,
		}
		if _, err := nric.InvokeWithSandbox(ctx, task, v1.Create, nriSB); err != nil {
			return nil, errors.Wrap(err, "nri invoke")
		}
	}

	if err := task.Start(ctx); err != nil {
		return nil, errors.Wrapf(err, "failed to start sandbox container task %q", id)
	}

	if err := sandbox.Status.Update(func(status sandboxstore.Status) (sandboxstore.Status, error) {
		// Set the pod sandbox as ready after successfully start sandbox container.
		status.Pid = task.Pid()
		status.State = sandboxstore.StateReady
		status.CreatedAt = info.CreatedAt
		return status, nil
	}); err != nil {
		return nil, errors.Wrap(err, "failed to update sandbox status")
	}

	// Add sandbox into sandbox store in INIT state.
	sandbox.Container = container

	if err := c.sandboxStore.Add(sandbox); err != nil {
		return nil, errors.Wrapf(err, "failed to add sandbox %+v into store", sandbox)
	}

	// start the monitor after adding sandbox into the store, this ensures
	// that sandbox is in the store, when event monitor receives the TaskExit event.
	//
	// TaskOOM from containerd may come before sandbox is added to store,
	// but we don't care about sandbox TaskOOM right now, so it is fine.
	c.eventMonitor.startSandboxExitMonitor(context.Background(), id, task.Pid(), exitCh)

	return &runtime.RunPodSandboxResponse{PodSandboxId: id}, nil
}

startContainer

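startContainer pulls the image, creates the container, starts the container, and, where one is configured, runs the post-start lifecycle hook.
The default sandbox image appears to be a pause container, such as gcr.io/google_containers/pause-amd64:3.0.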

// startContainer starts a container and returns a message indicates why it is failed on error.
// It starts the container through the following steps:
// * pull the image
// * create the container
// * start the container
// * run the post start lifecycle hooks (if applicable)
func (m *kubeGenericRuntimeManager) startContainer(podSandboxID string, podSandboxConfig *runtimeapi.PodSandboxConfig, spec *startSpec, pod *v1.Pod, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, podIP string, podIPs []string) (string, error) {
	container := spec.container

	// Step 1: pull the image.
	imageRef, msg, err := m.imagePuller.EnsureImageExists(pod, container, pullSecrets, podSandboxConfig)
	...

	// Step 2: create the container.
	// For a new container, the RestartCount should be 0
	restartCount := 0
	containerStatus := podStatus.FindContainerStatusByName(container.Name)
	...

	target, err := spec.getTargetID(podStatus)

	containerID, err := m.runtimeService.CreateContainer(podSandboxID, containerConfig, podSandboxConfig)

	// Step 3: start the container.
	err = m.runtimeService.StartContainer(containerID)

	m.recordContainerEvent(pod, container, containerID, v1.EventTypeNormal, events.StartedContainer, fmt.Sprintf("Started container %s", container.Name))

	containerMeta := containerConfig.GetMetadata()
	sandboxMeta := podSandboxConfig.GetMetadata()
	legacySymlink := legacyLogSymlink(containerID, containerMeta.Name, sandboxMeta.Name,
		sandboxMeta.Namespace)
	containerLog := filepath.Join(podSandboxConfig.LogDirectory, containerConfig.LogPath)

	// Step 4: execute the post start hook.
	if container.Lifecycle != nil && container.Lifecycle.PostStart != nil {
		kubeContainerID := kubecontainer.ContainerID{
			Type: m.runtimeName,
			ID:   containerID,
		}
		msg, handlerErr := m.runner.Run(kubeContainerID, pod, container, container.Lifecycle.PostStart)
	}
	return "", nil
}

2021-08-21

GitOps: GitHub Actions + Argo CD + Kubernetes

DevOps

Environment
Installing Argo CD
Registering the repository
- Sample app and service to deploy to k8s
Manual sync
Automatic sync
- Building the CI/CD pipeline

kurobato.hateblo.jp
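Last time, I set up an environment in which updating the Dockerfile repository automatically builds a container image and pushes it to Docker Hub.
Next, using a tool called Argo CD, I will build an environment in which updating the k8s manifest repository automatically deploys the containers.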

argoproj.github.io

Environment

Physical host OS: Ubuntu
Kubernetes (minikube + VirtualBox)

Reference: k8s setup procedure
Istio:マイクロサービス基盤入門 - 鳩小屋

Installing Argo CD

Create the argocd namespace and install Argo CD on the minikube k8s cluster.

# Prepare a k8s load balancer (used to reach argocd-server).
# Start the tunnel that minikube provides.
$ minikube tunnel
Status:	
	machine: minikube
	pid: 4324
	route: 10.96.0.0/12 -> 192.168.99.100
	minikube: Running
	services: [argocd-server, gitops-service, istio-ingressgateway]

# Check the k8s nodes
$ kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE    VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE               KERNEL-VERSION   CONTAINER-RUNTIME
minikube       Ready    control-plane,master   11d    v1.21.2   192.168.99.100   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m02   Ready    <none>                 136m   v1.21.2   192.168.99.101   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m03   Ready    <none>                 135m   v1.21.2   192.168.99.102   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6

# Install Argo CD
$ kubectl create namespace argocd
$ kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Checking the pods

Several pods have been deployed.
argocd-server appears to be the server that users access.

$ kubectl get pod -n argocd
NAME                                  READY   STATUS    RESTARTS   AGE
argocd-application-controller-0       1/1     Running   0          10m
argocd-dex-server-68c7bf5fdd-flk9c    1/1     Running   0          10m
argocd-redis-7547547c4f-pcmlk         1/1     Running   0          10m
argocd-repo-server-58f87478b8-lhg78   1/1     Running   0          10m
argocd-server-6f4fcdc5dc-bpmgc        1/1     Running   0          10m

Checking the services

By default, argocd-server is not exposed externally.
Change the service type to LoadBalancer so that argocd-server can be reached.

$ kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'

$ kubectl get services -n argocd
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
argocd-dex-server       ClusterIP      10.98.145.53    <none>         5556/TCP,5557/TCP,5558/TCP   3h5m
argocd-metrics          ClusterIP      10.110.187.11   <none>         8082/TCP                     3h5m
argocd-redis            ClusterIP      10.96.238.136   <none>         6379/TCP                     3h5m
argocd-repo-server      ClusterIP      10.107.230.1    <none>         8081/TCP,8084/TCP            3h5m
argocd-server           LoadBalancer   10.107.71.74    10.107.71.74   80:30364/TCP,443:32221/TCP   3h5m
argocd-server-metrics   ClusterIP      10.107.149.9    <none>         8083/TCP                     3h5m

The argocd-server pod can now be reached through the argocd-server service.

Downloading the CLI

sudo curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
sudo chmod +x /usr/local/bin/argocd

Accessing argocd-server

Access 10.107.71.74:80.

f:id:FallenPigeon:20210821094108p:plain

Checking the initial password (the initial user is admin)

$ kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d && echo
XXnTZyfLjyehwude
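Values under a Secret's `.data` field are base64-encoded, which is why the command above pipes the jsonpath output through `base64 -d`. The same decoding step can be sketched in Go. `decodeSecretValue` is a hypothetical helper name, and the sample value is made up rather than taken from the cluster above:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeSecretValue decodes one base64-encoded value as found in a
// Secret's .data field. The function name is hypothetical.
func decodeSecretValue(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", fmt.Errorf("decode secret value: %w", err)
	}
	return string(raw), nil
}

func main() {
	// "YWRtaW4=" is a made-up sample value for illustration only.
	value, err := decodeSecretValue("YWRtaW4=")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(value)
}
```

Base64 is an encoding, not encryption, so anyone who can read the Secret can recover the password in this way.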

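Registering the repository

f:id:FallenPigeon:20210821105042p:plain

Register the repository that holds the k8s manifests with Argo CD.

f:id:FallenPigeon:20210821104641p:plain

Sample app and service to deploy to k8s

A web server container that returns Hello GitOps!! is deployed as Deployment pods on k8s and exposed through a load balancer.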

package main

import (
  "fmt"
  "net/http"
)

func handler(w http.ResponseWriter, r *http.Request){
  fmt.Fprintf(w,"Hello GitOps!!")
}

func main(){
  http.HandleFunc("/",handler)
  http.ListenAndServe(":8080",nil)
}
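Before building the image, the handler can be checked in-process without opening a port. The following is a sketch using the standard `net/http/httptest` package. The `checkHandler` helper is a hypothetical addition, and `handler` is repeated from the app above so that the example is self-contained:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// handler is the same handler as in the sample app.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello GitOps!!")
}

// checkHandler is a hypothetical helper that calls the handler with a
// recorded response and returns the status code and body.
func checkHandler() (int, string, error) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)

	res := rec.Result()
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return 0, "", err
	}
	return res.StatusCode, string(body), nil
}

func main() {
	status, body, err := checkHandler()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(status, body)
}
```

A check like this could run as a step before the image build in CI, so a broken handler is caught before anything is pushed.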

# Stage-1
FROM golang:1.16 as builder
COPY ./app/main.go ./
RUN go build -o /gitops-go-app ./main.go

# Stage-2
FROM ubuntu
EXPOSE 8080
COPY --from=builder /gitops-go-app /.
ENTRYPOINT ["./gitops-go-app"]

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitops-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gitops
  template:
    metadata:
      labels:
        app: gitops
    spec:
      containers:
      - name: gitops
        image: docker.io/l3j5g7d9/gitops-go-app:latest
        imagePullPolicy: IfNotPresent

apiVersion: v1
kind: Service
metadata:
  name: gitops-service
spec:
  type: LoadBalancer
  ports:
    - name: gitops
      protocol: TCP
      port: 80
      targetPort: 8080
  selector:
    app: gitops

f:id:FallenPigeon:20210821104711p:plain

The manifest has been recognized.

Manual sync

Because sync is set to manual, click the sync button.
f:id:FallenPigeon:20210821105145p:plain

The display then shows the Deployment and its pods as deployed.
f:id:FallenPigeon:20210821105158p:plain

Check that they are running on k8s.

$ kubectl get pods -n default
NAME                                 READY   STATUS    RESTARTS   AGE
gitops-deployment-64879cfb89-n48hc   2/2     Running   0          21m
gitops-deployment-64879cfb89-q878s   2/2     Running   0          21m
gitops-deployment-64879cfb89-v58gk   2/2     Running   0          21m

$ kubectl get services -n default
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
gitops-service   LoadBalancer   10.106.186.67   10.106.186.67   80:32018/TCP   2m45s
kubernetes       ClusterIP      10.96.0.1       <none>          443/TCP        11d

$ curl 10.106.186.67:80
Hello GitOps!!

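The pods and the service have been created, and the web server can be reached through the load balancer.

Automatic sync

Next, enable automatic sync. With this, changes to the k8s manifest repository should be detected automatically and deployed to k8s.
f:id:FallenPigeon:20210821111202p:plain

Try changing the name in the repository's k8s manifest (the Deployment name, which prefixes the pod names).
f:id:FallenPigeon:20210821111732p:plain

The change was then reflected automatically in both Argo CD and k8s.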

f:id:FallenPigeon:20210821112103p:plain

$ kubectl get pods -n default
NAME                                         READY   STATUS    RESTARTS   AGE
gitops-deployment-changed-64879cfb89-q2gnj   2/2     Running   0          5m21s
gitops-deployment-changed-64879cfb89-rk6cn   2/2     Running   0          5m21s
gitops-deployment-changed-64879cfb89-trjn9   2/2     Running   0          5m21s

$ kubectl get services -n default
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
gitops-service   LoadBalancer   10.106.186.67   10.106.186.67   80:32018/TCP   43m
kubernetes       ClusterIP      10.96.0.1       <none>          443/TCP        11d

$ curl 10.106.186.67:80
Hello GitOps!!

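That completes the CD side. Being able to do everything by clicking through the GUI makes it easy.

Building the CI/CD pipeline

So far I have set up GitHub Actions (CI) and Argo CD (CD) environments.
Finally, I integrate them into a CI/CD pipeline.
In other words, pushing the application source code and Dockerfile to the repository triggers everything from build to deployment automatically.

1. GitHub Actions: an update to the Dockerfile repository triggers an automatic build and a push of the container image
2. Argo CD: an update to the k8s manifest repository triggers an automatic deployment

The approach is to add a step to the GitHub Actions workflow that updates the k8s manifest repository, which in turn triggers Argo CD's sync.
The step simply rewrites the name in app.yaml to gitops-deployment[deployment number], using the workflow run number, and pushes it.
In practice it seems safer to manage the Kubernetes manifests with Helm as well, but that complicates things a little, so I leave it out this time.

As an aside, I did not know there was a yq command. The infrastructure layer is full of YAML these days, so it looks like it will come in handy.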

code/.github/workflows$ nano main.yml 
name: Github Action CI

on:
  push:
    branches: [ main ]

jobs:
  build:
    name: GitOps Workflow
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      # Build the image with BuildKit
      - name: Build an image from Dockerfile
        run: |
          DOCKER_BUILDKIT=1 docker image build . -f app/Dockerfile --tag ${{ secrets.DOCKERUSER }}/gitops-go-app:latest
      # Scan the image with Trivy
      - name: Run Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ secrets.DOCKERUSER }}/gitops-go-app:latest'
          format: 'table'
          exit-code: '1'
          ignore-unfixed: true
          severity: 'CRITICAL,HIGH'
      # Push the image to Docker Hub
      - name: Push Image
        run: |
          docker login docker.io --username ${{ secrets.DOCKERUSER }} --password ${{ secrets.DOCKERPASSWORD }}
          docker image push ${{ secrets.DOCKERUSER }}/gitops-go-app:latest
      # Update the Kubernetes manifest
      - name: Change Pod Name
        run: |
          echo -e "machine github.com\nlogin ${{ secrets.GITHUBUSER }}\npassword ${{ secrets.GITHUBTOKEN }}" > ~/.netrc
          git config --global user.email ${{ secrets.EMAIL }}
          git config --global user.name ${{ secrets.GITHUBUSER }}
          git clone https://github.com/${{ secrets.GITHUBUSER }}/config.git
          cd config/manifest
          yq e '.metadata.name = "gitops-deployment${{ github.run_number }}"' -i app.yaml
          git add app.yaml
          git commit -m ${{ github.run_number }} -a
          git push origin main

After the GitHub Actions workflow finishes, checking app.yaml in the repository shows that the name (metadata.name) has been updated.

f:id:FallenPigeon:20210821170355p:plain

f:id:FallenPigeon:20210821170410p:plain

Next, the Argo CD console shows that the updated pod name has been reflected.

f:id:FallenPigeon:20210821170834p:plain

It is running properly on k8s as well.

$ kubectl get pods -n default
NAME                                   READY   STATUS    RESTARTS   AGE
gitops-deployment31-64879cfb89-6ds4x   2/2     Running   0          24m
gitops-deployment31-64879cfb89-hfhh2   2/2     Running   0          24m
gitops-deployment31-64879cfb89-tmlf9   2/2     Running   0          24m

$ kubectl get services -n default
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
gitops-service   LoadBalancer   10.106.186.67   10.106.186.67   80:32018/TCP   6h31m
kubernetes       ClusterIP      10.96.0.1       <none>          443/TCP        12d

$ curl 10.106.186.67:80
Hello GitOps!!

That completes the CI/CD pipeline.
It works for now, so I am satisfied.

2021-08-15

GitHub Actions: CI

DevOps

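Preparing the application and Dockerfile
Setting up GitHub Actions
Pushing to the repository
Checking the workflow
Checking Docker Hub
Summary

For CI/CD I had only ever used the AWS CodeX series, so I tried out GitHub Actions (CI) a little.
This post sets up a simple environment in which updating the Dockerfile or the application automatically builds and pushes the image.

Preparing the application and Dockerfile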

user@user-HP:~/gitops/code/app$ ls
Dockerfile  main.go

user@user-HP:~/gitops/code/app$ cat main.go 
package main

import (
  "fmt"
  "net/http"
)

func handler(w http.ResponseWriter, r *http.Request){
  fmt.Fprintf(w,"Hello GitOps!!")
}

func main(){
  http.HandleFunc("/",handler)
  http.ListenAndServe(":8080",nil)
}

user@user-HP:~/gitops/code/app$ cat Dockerfile 
# Stage-1
FROM golang:1.16 as builder
COPY ./app/main.go ./
RUN go build -o /gitops-go-app ./main.go

# Stage-2
FROM ubuntu
EXPOSE 8080
COPY --from=builder /gitops-go-app /.
ENTRYPOINT ["./gitops-go-app"]

Prepare a git repository that contains the files above.
f:id:FallenPigeon:20210815180703p:plain

Setting up GitHub Actions

f:id:FallenPigeon:20210815180844p:plain
f:id:FallenPigeon:20210815181101p:plain

Registering secrets

Create the secrets that the workflow configuration refers to.
f:id:FallenPigeon:20210815181437p:plain

Syncing with the local repository

user@user-HP:~/gitops/code$ git pull

Configuring the workflow

user@user-HP:~/gitops/code/.github/workflows$ nano main.yml                                                                       

name: Github Action CI

# Trigger on pushes to the main branch
on:
  push:
    branches: [ main ]

jobs:
  build:
    name: GitOps Workflow
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      # Build the image with BuildKit
      - name: Build an image from Dockerfile
        run: |
          DOCKER_BUILDKIT=1 docker image build . -f app/Dockerfile --tag ${{ secrets.DOCKERUSER }}/gitops-go-app:latest
      # Scan the image with Trivy
      - name: Run Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ secrets.DOCKERUSER }}/gitops-go-app:latest'
          format: 'table'
          exit-code: '1'
          ignore-unfixed: true
          severity: 'CRITICAL,HIGH'
      # Push the image to Docker Hub
      - name: Push Image
        run: |
          echo "${{ secrets.DOCKERPASSWORD }}" | docker login docker.io --username ${{ secrets.DOCKERUSER }} --password-stdin
          docker image push ${{ secrets.DOCKERUSER }}/gitops-go-app:latest
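
Before pushing, the same three steps can be tried by hand from the repository root. A minimal sketch, assuming a helper named run_ci_locally (hypothetical), a locally installed trivy CLI, and a prior docker login; the flags mirror the options passed to the Trivy action above:

```shell
# run_ci_locally DOCKER_USER
# Hypothetical helper mirroring the workflow steps: build, scan, then push.
# It stops at the first failing step, so a failed scan prevents the push.
run_ci_locally() {
  image="$1/gitops-go-app:latest"
  DOCKER_BUILDKIT=1 docker image build . -f app/Dockerfile --tag "$image" || return 1
  # Same policy as the Trivy step: fail on fixable CRITICAL/HIGH findings
  trivy image --exit-code 1 --ignore-unfixed --severity CRITICAL,HIGH "$image" || return 1
  docker image push "$image"
}

# Example: run_ci_locally mydockeruser   (mydockeruser is a placeholder)
```

Running the scan between build and push reproduces the gate that exit-code: '1' provides in the workflow.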

Pushing to the repository

user@user-HP:~/gitops/code$ git add .
user@user-HP:~/gitops/code$ git commit -m "main.yml change"
user@user-HP:~/gitops/code$ git push -u origin main

Checking the workflow

The workflow starts automatically.

f:id:FallenPigeon:20210815182859p:plain

When an error occurs
f:id:FallenPigeon:20210815183412p:plain

On success
f:id:FallenPigeon:20210815183128p:plain

Checking Docker Hub

Once the workflow completes, the image tagged latest is visible in Docker Hub.
f:id:FallenPigeon:20210815183607p:plain

Summary

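From a security standpoint, it would also be worth scanning the source code, the Dockerfile, and the container image.
Hooking this up to Kubernetes with Argo CD or a similar tool should complete a container-based CI/CD pipeline.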

2021-08-10

Istio: An Introduction to Microservice Infrastructure

Microservices Kubernetes

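Installing virtualbox
Installing kubectl
Installing docker (not needed for VirtualBox-based Kubernetes)
Installing minikube
Building kubernetes
- virtualbox version
- docker version
Building Istio
- Downloading the Istio package
- Applying the sample configuration
Sample application
- Checking the initial configuration
- Exposing it externally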

Notes on building an Istio demo environment.
minikube builds a virtualbox-based Kubernetes cluster, and Istio is installed on top of it.

Installing virtualbox

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="21.04 (Hirsute Hippo)"

#Install virtualbox
$ sudo apt-get install virtualbox

Installing kubectl

#Install curl
$ sudo apt install curl

#Download kubectl
$ curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl"
#Make it executable
$ chmod +x ./kubectl
#Put it on the PATH
$ sudo mv ./kubectl /usr/local/bin/kubectl

#Verify kubectl
$ kubectl version --client
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}

Installing docker (not needed for VirtualBox-based Kubernetes)

#Update the package index
$ sudo apt update
#Install the required packages
$ sudo apt install apt-transport-https ca-certificates software-properties-common
#Add the docker repository
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
#Update the package index
$ sudo apt update
#Install docker
$ sudo apt install docker-ce
#Verify docker
$ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-08-09 14:18:39 JST; 3h 4min ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 1436 (dockerd)
      Tasks: 12
     Memory: 12.2M
     CGroup: /system.slice/docker.service
             └─1436 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

$ docker version
Client: Docker Engine - Community
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:54:27 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

#Add user to the docker group so that sudo can be omitted
$ sudo usermod -aG docker user

#Log out and back in, then restart docker
$ sudo systemctl restart docker

Installing minikube

#Download the minikube binary
$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
#Make it executable
$ chmod +x minikube
#Install it on the PATH
$ sudo mkdir -p /usr/local/bin/
$ sudo install minikube /usr/local/bin/

Building kubernetes

virtualbox version

With the node count set to 3, three VMs start in total: one master node and two worker nodes.
This Istio build uses the virtualbox pattern because it is closer to a setup on real machines.

#Build kubernetes
$ minikube start --driver=virtualbox --nodes 3

f:id:FallenPigeon:20210809175104p:plain

#Check node information
$ kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE     VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE               KERNEL-VERSION   CONTAINER-RUNTIME
minikube       Ready    control-plane,master   3h48m   v1.21.2   192.168.99.100   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m02   Ready    <none>                 3h21m   v1.21.2   192.168.99.101   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m03   Ready    <none>                 3h      v1.21.2   192.168.99.102   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6

CONTAINER-RUNTIME shows docker, but this is the Docker running inside the virtualbox VMs, not the Docker on the host OS.
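
Whether all three nodes came up can also be checked without reading the table by eye. A minimal sketch, assuming a helper named ready_node_count (hypothetical) that counts the rows whose STATUS column is Ready:

```shell
# ready_node_count
# Hypothetical helper: prints how many nodes report STATUS "Ready".
ready_node_count() {
  kubectl get nodes --no-headers |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Example:
#   [ "$(ready_node_count)" -eq 3 ] && echo "all three nodes are Ready"
```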

docker version

$ minikube start --driver=docker

Building Istio

From here on, the Istio environment is built following the official procedure.
Istio / Getting Started

Downloading the Istio package

$ curl -L https://istio.io/downloadIstio | sh -
$ cd istio-1.10.3
#Put istioctl on the PATH
$ sudo mv bin/istioctl /usr/local/bin

Applying the sample configuration

$ istioctl install --set profile=demo -y
#Enable automatic injection of the Envoy sidecar proxy
$ kubectl label namespace default istio-injection=enabled
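
Sidecars are only injected into pods created after the label is set, so it can help to confirm the label first. A minimal sketch, assuming a helper named injection_enabled (hypothetical) that looks the namespace up through a label selector:

```shell
# injection_enabled NAMESPACE
# Hypothetical helper: succeeds if NAMESPACE carries istio-injection=enabled.
injection_enabled() {
  # -o name prints one "namespace/<name>" line per matching namespace
  kubectl get namespace -l istio-injection=enabled -o name |
    grep -qx "namespace/$1"
}

# Example:
#   injection_enabled default && echo "sidecar injection is on for default"
```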

Sample application

$ kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml

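Let's look inside the manifest. It defines four (Kubernetes) Services: Details service, Ratings service, Reviews service, and Productpage services.
Specifically, kind: Deployment is the application pod (container), kind: ServiceAccount is the service account assigned to the pod (something like AWS IAM), and kind: Service is the port binding to the container.
This Service (ClusterIP) in particular ties port 9080 on a virtual IP address that is only valid inside Kubernetes (the ClusterIP assigned to the Service) to port 9080 on the container itself.
As a result, a pod inside Kubernetes that accesses ClusterIP:9080 reaches port 9080 on the container. There is no connectivity to the outside, however.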

$ less samples/bookinfo/platform/kube/bookinfo.yaml
##################################################################################################
# Details service
##################################################################################################
apiVersion: v1
kind: Service
metadata:
  name: details
  labels:
    app: details
    service: details
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: details
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-details
  labels:
    account: details
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: details-v1
  labels:
    app: details
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: details
      version: v1
  template:
    metadata:
      labels:
        app: details
        version: v1
    spec:
      serviceAccountName: bookinfo-details
      containers:
      - name: details
        image: docker.io/istio/examples-bookinfo-details-v1:1.16.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
        securityContext:
          runAsUser: 1000
---
##################################################################################################
# Ratings service
##################################################################################################
apiVersion: v1
kind: Service
metadata:
  name: ratings
  labels:
    app: ratings
    service: ratings
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: ratings
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-ratings
  labels:
    account: ratings
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ratings-v1
  labels:
    app: ratings
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratings
      version: v1
  template:
    metadata:
      labels:
        app: ratings
        version: v1
    spec:
      serviceAccountName: bookinfo-ratings
      containers:
      - name: ratings
        image: docker.io/istio/examples-bookinfo-ratings-v1:1.16.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
        securityContext:
          runAsUser: 1000
---
##################################################################################################
# Reviews service
##################################################################################################
apiVersion: v1
kind: Service
metadata:
  name: reviews
  labels:
    app: reviews
    service: reviews
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: reviews
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-reviews
  labels:
    account: reviews
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v1
  labels:
    app: reviews
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
      version: v1
  template:
    metadata:
      labels:
        app: reviews
        version: v1
    spec:
      serviceAccountName: bookinfo-reviews
      containers:
      - name: reviews
        image: docker.io/istio/examples-bookinfo-reviews-v1:1.16.2
        imagePullPolicy: IfNotPresent
        env:
        - name: LOG_DIR
          value: "/tmp/logs"
        ports:
        - containerPort: 9080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: wlp-output
          mountPath: /opt/ibm/wlp/output
        securityContext:
          runAsUser: 1000
      volumes:
      - name: wlp-output
        emptyDir: {}
      - name: tmp
        emptyDir: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v2
  labels:
    app: reviews
    version: v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
      version: v2
  template:
    metadata:
      labels:
        app: reviews
        version: v2
    spec:
      serviceAccountName: bookinfo-reviews
      containers:
      - name: reviews
        image: docker.io/istio/examples-bookinfo-reviews-v2:1.16.2
        imagePullPolicy: IfNotPresent
        env:
        - name: LOG_DIR
          value: "/tmp/logs"
        ports:
        - containerPort: 9080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: wlp-output
          mountPath: /opt/ibm/wlp/output
        securityContext:
          runAsUser: 1000
      volumes:
      - name: wlp-output
        emptyDir: {}
      - name: tmp
        emptyDir: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v3
  labels:
    app: reviews
    version: v3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
      version: v3
  template:
    metadata:
      labels:
        app: reviews
        version: v3
    spec:
      serviceAccountName: bookinfo-reviews
      containers:
      - name: reviews
        image: docker.io/istio/examples-bookinfo-reviews-v3:1.16.2
        imagePullPolicy: IfNotPresent
        env:
        - name: LOG_DIR
          value: "/tmp/logs"
        ports:
        - containerPort: 9080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: wlp-output
          mountPath: /opt/ibm/wlp/output
        securityContext:
          runAsUser: 1000
      volumes:
      - name: wlp-output
        emptyDir: {}
      - name: tmp
        emptyDir: {}
---
##################################################################################################
# Productpage services
##################################################################################################
apiVersion: v1
kind: Service
metadata:
  name: productpage
  labels:
    app: productpage
    service: productpage
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: productpage
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-productpage
  labels:
    account: productpage
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage-v1
  labels:
    app: productpage
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: productpage
      version: v1
  template:
    metadata:
      labels:
        app: productpage
        version: v1
    spec:
      serviceAccountName: bookinfo-productpage
      containers:
      - name: productpage
        image: docker.io/istio/examples-bookinfo-productpage-v1:1.16.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        securityContext:
          runAsUser: 1000
      volumes:
      - name: tmp
        emptyDir: {}

The manifest above deploys the following application.

f:id:FallenPigeon:20210809183650p:plain

Checking the initial configuration

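Listing the kubernetes namespaces at this point shows that a namespace called istio-system has been created.
The others are the default kubernetes namespaces. Unless specified otherwise, application pods are deployed to the default namespace.
Note that these namespaces are unrelated to linux kernel namespaces; they are a logical unit specific to kubernetes.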

$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   4h12m
istio-system      Active   131m
kube-node-lease   Active   4h12m
kube-public       Active   4h12m
kube-system       Active   4h12m

Check the pods deployed in each namespace.

$ kubectl get pods -o wide -n istio-system
NAME                                   READY   STATUS    RESTARTS   AGE    IP           NODE           NOMINATED NODE   READINESS GATES
istio-egressgateway-5547fcc8fc-zpd72   1/1     Running   0          136m   10.244.2.4   minikube-m03   <none>           <none>
istio-ingressgateway-8f568d595-s68gh   1/1     Running   0          136m   10.244.2.5   minikube-m03   <none>           <none>
istiod-568d797f55-6rvp5                1/1     Running   0          136m   10.244.1.3   minikube-m02   <none>           <none>

In the istio-system namespace, istio-egressgateway (exit) and istio-ingressgateway (entrance) are deployed as pods.
Communication with anything outside the Kubernetes cluster goes through these endpoints.

#Pod list
$ kubectl get pods -o wide -n default
NAME                              READY   STATUS    RESTARTS   AGE    IP            NODE           NOMINATED NODE   READINESS GATES
details-v1-79f774bdb9-dxbmb       2/2     Running   0          130m   10.244.1.7    minikube-m02   <none>           <none>
productpage-v1-6b746f74dc-mt5sn   2/2     Running   0          130m   10.244.1.9    minikube-m02   <none>           <none>
ratings-v1-b6994bb9-9gmdq         2/2     Running   0          130m   10.244.2.9    minikube-m03   <none>           <none>
reviews-v1-545db77b95-zhr6v       2/2     Running   0          130m   10.244.1.8    minikube-m02   <none>           <none>
reviews-v2-7bf8c9648f-4n9c6       2/2     Running   0          130m   10.244.0.3    minikube       <none>           <none>
reviews-v3-84779c7bbc-8trsh       2/2     Running   0          130m   10.244.2.10   minikube-m03   <none>           <none>

#Pod details
$ kubectl describe pod productpage-v1-6b746f74dc-mt5sn
Name:         productpage-v1-6b746f74dc-mt5sn
Namespace:    default
Priority:     0
Node:         minikube-m02/192.168.99.101
Start Time:   Mon, 09 Aug 2021 16:11:35 +0900
Labels:       app=productpage
...
Containers:
  productpage:
    Container ID:   docker://25ae5f0dbd0984a2d8a675ffbe71592279ce936215b12c417aa06d543e201919
    Image:          docker.io/istio/examples-bookinfo-productpage-v1:1.16.2
    Image ID:       docker-pullable://istio/examples-bookinfo-productpage-v1@sha256:63ac3b4fb6c3ba395f5d044b0e10bae513afb34b9b7d862b3a7c3de7e0686667
    Port:           9080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 09 Aug 2021 16:11:36 +0900
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-f7hl9 (ro)
  istio-proxy:
    Container ID:  docker://f48cac26a0bcce1c056044843c44389ec3cb7d55e477d7a39014b8c0746c9b61
    Image:         docker.io/istio/proxyv2:1.10.3
    Image ID:      docker-pullable://istio/proxyv2@sha256:a78b7a165744384d95f75d157c34e02d6b4355aaf8fe2a2c75914832bdf764e8
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --serviceCluster
      productpage.$(POD_NAMESPACE)
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --log_output_level=default:info
      --concurrency
      2
    State:          Running
      Started:      Mon, 09 Aug 2021 16:11:37 +0900
...

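The default namespace holds the Details, Ratings, Reviews, and Productpage pods that make up the application.
Looking at the details of the productpage-v1-6b746f74dc-mt5sn pod, two containers are running: productpage and istio-proxy.
The productpage container is the application itself, and the istio-proxy container is the proxy that handles routing between pods. In other words, all traffic of an application pod under Istio passes through istio-proxy.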
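
The same two-container check can be done without scrolling through describe output. A minimal sketch, assuming helpers named pod_containers and has_sidecar (both hypothetical):

```shell
# pod_containers POD [NAMESPACE]
# Hypothetical helper: prints the container names of a pod, one per line.
pod_containers() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.containers[*].name}' | tr ' ' '\n'
}

# has_sidecar POD [NAMESPACE]
# Succeeds if the pod contains a container named istio-proxy.
has_sidecar() {
  pod_containers "$@" | grep -qx 'istio-proxy'
}

# Example:
#   has_sidecar productpage-v1-6b746f74dc-mt5sn && echo "sidecar injected"
```

A pod that shows READY 2/2 but fails this check has a second container that is not the Istio sidecar.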

Next, check the services assigned to Details (Pod), Ratings (Pod), Reviews (Pod), and Productpage (Pod).
The ClusterIPs of the endpoints differ from the VM IPs. What ties them together are istio-egressgateway (exit) and istio-ingressgateway (entrance), described later.

$ kubectl get services -n default
NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
details       ClusterIP   10.102.12.28     <none>        9080/TCP   3h7m
kubernetes    ClusterIP   10.96.0.1        <none>        443/TCP    5h14m
productpage   ClusterIP   10.103.127.69    <none>        9080/TCP   3h7m
ratings       ClusterIP   10.102.176.254   <none>        9080/TCP   3h7m
reviews       ClusterIP   10.109.191.110   <none>        9080/TCP   3h7m

#Check node information
$ kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE     VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE               KERNEL-VERSION   CONTAINER-RUNTIME
minikube       Ready    control-plane,master   3h48m   v1.21.2   192.168.99.100   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m02   Ready    <none>                 3h21m   v1.21.2   192.168.99.101   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
minikube-m03   Ready    <none>                 3h      v1.21.2   192.168.99.102   <none>        Buildroot 2020.02.12   4.19.182         docker://20.10.6
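
The ClusterIP Services listed above are reachable only from inside the cluster. A minimal sketch for confirming that, assuming a helper named productpage_title (hypothetical); the kubectl exec call follows the in-cluster check from the Istio getting-started guide:

```shell
# productpage_title
# Hypothetical helper: runs curl inside the ratings pod and prints the page
# title, which shows that the Service name productpage answers on port 9080.
productpage_title() {
  pod=$(kubectl get pod -l app=ratings \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  kubectl exec "$pod" -c ratings -- curl -sS productpage:9080/productpage |
    grep -o "<title>.*</title>"
}

# Example:
#   productpage_title
```

Inside the cluster the Service name resolves to the ClusterIP, so no node IP or NodePort is involved in this request.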

Exposing it externally

Associate the deployed application Service (Pod) with the Istio gateway.

#Bind to the gateway
$ kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml

#Contents of the manifest
$ less samples/bookinfo/networking/bookinfo-gateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - "*"
  gateways:
  - bookinfo-gateway
  http:
  - match:
    - uri:
        exact: /productpage
    - uri:
        prefix: /static
    - uri:
        exact: /login
    - uri:
        exact: /logout
    - uri:
        prefix: /api/v1/products
    route:
    - destination:
        host: productpage
        port:
          number: 9080

The istio-ingressgateway settings show that the nodePort for http2 is 30800 and the nodePort for https is 31633.
This means that a request to node IP:30800 is forwarded to port 8080 of the container in the istio-ingressgateway pod.

$ kubectl -n istio-system get service istio-ingressgateway -o json
...
    "spec": {
        "clusterIP": "10.106.165.119",
        "clusterIPs": [
            "10.106.165.119"
        ],
        "externalTrafficPolicy": "Cluster",
        "ipFamilies": [
            "IPv4"
        ],
        "ipFamilyPolicy": "SingleStack",
        "ports": [
            {
                "name": "status-port",
                "nodePort": 31759,
                "port": 15021,
                "protocol": "TCP",
                "targetPort": 15021
            },
            {
                "name": "http2",
                "nodePort": 30800,
                "port": 80,
                "protocol": "TCP",
                "targetPort": 8080
            },
            {
                "name": "https",
                "nodePort": 31633,
                "port": 443,
                "protocol": "TCP",
                "targetPort": 8443
            },
            {
                "name": "tcp",
                "nodePort": 32007,
                "port": 31400,
                "protocol": "TCP",
                "targetPort": 31400
            },
            {
                "name": "tls",
                "nodePort": 30666,
                "port": 15443,
                "protocol": "TCP",
                "targetPort": 15443
            }
        ],
        "selector": {
            "app": "istio-ingressgateway",
            "istio": "ingressgateway"
        },
        "sessionAffinity": "None",
        "type": "LoadBalancer"
...
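
The NodePort can also be read with a jsonpath query instead of from the full JSON. A minimal sketch, assuming a helper named ingress_url (hypothetical) and that minikube ip returns the node address used in this post; the query itself follows the one in the Istio getting-started guide:

```shell
# ingress_url [PATH]
# Hypothetical helper: prints the URL of the ingress gateway's http2 NodePort.
# PATH defaults to /productpage, one of the paths the VirtualService matches.
ingress_url() {
  host=$(minikube ip) || return 1
  port=$(kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}') || return 1
  echo "http://${host}:${port}${1:-/productpage}"
}

# Example:
#   curl -s "$(ingress_url)" | grep -o "<title>.*</title>"
```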

Next, check the container in the istio-ingressgateway pod.
It does listen on 8080. This container is istio-ingressgateway itself (Envoy).

$ kubectl describe pod -n istio-system istio-ingressgateway-8f568d595-s68gh
Name:         istio-ingressgateway-8f568d595-s68gh
Namespace:    istio-system
Priority:     0
Node:         minikube-m03/192.168.99.102
Start Time:   Mon, 09 Aug 2021 16:06:47 +0900
Labels:       app=istio-ingressgateway
...
Status:       Running
IP:           10.244.2.5
IPs:
  IP:           10.244.2.5
Controlled By:  ReplicaSet/istio-ingressgateway-8f568d595
Containers:
  istio-proxy:
    Container ID:  docker://1396e222cd2aa759fff144ef299f4a19910b47ea4891e8d22caee5ad9e973ee8
    Image:         docker.io/istio/proxyv2:1.10.3
    Image ID:      docker-pullable://istio/proxyv2@sha256:a78b7a165744384d95f75d157c34e02d6b4355aaf8fe2a2c75914832bdf764e8
    Ports:         15021/TCP, 8080/TCP, 8443/TCP, 31400/TCP, 15443/TCP, 15090/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
...

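The sidecar proxy configuration consists of listener (receive), route (binding), cluster (send), and endpoint (node).
First, check the listening ports of istio-ingressgateway.
Traffic received on 8080 is sent to Route: http.80.
As a side note, the ingressGateways settings map Service port 80 to targetPort 8080.
In other words, traffic addressed to port 80 arrives at the ingress gateway pod on 8080, which is why the route is named http.80.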

$ istioctl proxy-config listener istio-ingressgateway-8f568d595-s68gh -n istio-system
ADDRESS PORT  MATCH DESTINATION
0.0.0.0 8080  ALL   Route: http.80
0.0.0.0 15021 ALL   Inline Route: /healthz/ready*
0.0.0.0 15090 ALL   Inline Route: /stats/prometheus*

    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
      k8s:
        resources:
          requests:
            cpu: 10m
            memory: 40Mi
        service:
          ports:
            ## You can add custom gateway ports in user values overrides, but it must include those ports since helm replaces.
            # Note that AWS ELB will by default perform health checks on the first port
            # on this list. Setting this to the health check port will ensure that health
            # checks always work. https://github.com/istio/istio/issues/12503
            - port: 15021
              targetPort: 15021
              name: status-port
            - port: 80
              targetPort: 8080
              name: http2
            - port: 443
              targetPort: 8443
              name: https
            - port: 31400
              targetPort: 31400
              name: tcp
              # This is the port where sni routing happens
            - port: 15443
              targetPort: 15443
              name: tls

Next, the routes show that http.80 is forwarded to the VIRTUAL SERVICE.
That is, the VIRTUAL SERVICE (80→9080) forwards the traffic to port 9080 of the productpage service.

$ istioctl proxy-config route istio-ingressgateway-8f568d595-s68gh -n istio-system
NAME        DOMAINS     MATCH                  VIRTUAL SERVICE
http.80     *           /productpage           bookinfo.default
http.80     *           /static*               bookinfo.default
http.80     *           /login                 bookinfo.default
http.80     *           /logout                bookinfo.default
http.80     *           /api/v1/products*      bookinfo.default
            *           /stats/prometheus*     
            *           /healthz/ready*
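
To make the exact and prefix rules in this table concrete, the matching can be imitated with a case statement. This is only an illustration of the rule set in bookinfo-gateway.yaml, not how Envoy implements it; the function name bookinfo_route is hypothetical:

```shell
# bookinfo_route PATH
# Hypothetical re-creation of the VirtualService match rules: prints the
# destination for PATH, or fails when no rule matches.
bookinfo_route() {
  case "$1" in
    /productpage | /login | /logout)   # exact matches
      echo "productpage:9080" ;;
    /static* | /api/v1/products*)      # prefix matches
      echo "productpage:9080" ;;
    *)
      return 1 ;;
  esac
}

# Example:
#   bookinfo_route /static/bootstrap/css/bootstrap.min.css
#   bookinfo_route / || echo "no route"
```

Paths outside these five rules, including the bare /, have no route on the gateway.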

The VIRTUAL SERVICE gets the traffic to port 9080 of the productpage service, but it is not forwarded straight to the application pod.

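The sidecar proxy configuration of the productpage pod shows listening ports 15001 and 15006.
iptables rules redirect all traffic: outbound traffic goes to port 15001 of the proxy and inbound traffic to port 15006.
This request matches 15006 Trans: raw_buffer; Addr: *:9080, so the proxy forwards it to port 9080 of productpage.
Only now does the request reach the application pod.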

https://speakerdeck.com/110y/tour-of-istio?slide=39

$ istioctl proxy-config listener productpage-v1-6b746f74dc-mt5sn
...
0.0.0.0       15001 ALL                                                                                             PassthroughCluster
0.0.0.0       15001 Addr: *:15001                                                                                   Non-HTTP/Non-TCP
0.0.0.0       15006 Addr: *:15006                                                                                   Non-HTTP/Non-TCP
0.0.0.0       15006 Trans: tls; App: istio-http/1.0,istio-http/1.1,istio-h2; Addr: 0.0.0.0/0                        InboundPassthroughClusterIpv4
0.0.0.0       15006 Trans: raw_buffer; App: HTTP; Addr: 0.0.0.0/0                                                   InboundPassthroughClusterIpv4
0.0.0.0       15006 Trans: tls; App: TCP TLS; Addr: 0.0.0.0/0                                                       InboundPassthroughClusterIpv4
0.0.0.0       15006 Trans: raw_buffer; Addr: 0.0.0.0/0                                                              InboundPassthroughClusterIpv4
0.0.0.0       15006 Trans: tls; Addr: 0.0.0.0/0                                                                     InboundPassthroughClusterIpv4
0.0.0.0       15006 Trans: tls; App: istio,istio-peer-exchange,istio-http/1.0,istio-http/1.1,istio-h2; Addr: *:9080 Cluster: inbound|9080||
0.0.0.0       15006 Trans: raw_buffer; Addr: *:9080                                                                 Cluster: inbound|9080||
...
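
The listener table is long, so narrowing it to the inbound rows for port 9080 makes the relevant entries easier to spot. A minimal sketch, assuming a helper named inbound_9080_listeners (hypothetical) and the --port filter of istioctl proxy-config listener:

```shell
# inbound_9080_listeners POD
# Hypothetical helper: prints only the 15006 (inbound) listener rows whose
# match targets destination port 9080.
inbound_9080_listeners() {
  istioctl proxy-config listener "$1" --port 15006 | grep 'Addr: \*:9080'
}

# Example:
#   inbound_9080_listeners productpage-v1-6b746f74dc-mt5sn
```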

$ istioctl proxy-config route productpage-v1-6b746f74dc-mt5sn
NAME                                                          DOMAINS                               MATCH                  VIRTUAL SERVICE
kube-dns.kube-system.svc.cluster.local:9153                   kube-dns.kube-system                  /*                     
80                                                            istio-egressgateway.istio-system      /*                     
80                                                            istio-ingressgateway.istio-system     /*                     
9080                                                          details                               /*                     
9080                                                          productpage                           /*                     
9080                                                          ratings                               /*                     
9080                                                          reviews                               /*                     
15010                                                         istiod.istio-system                   /*                     
15014                                                         istiod.istio-system                   /*                     
istio-ingressgateway.istio-system.svc.cluster.local:15021     istio-ingressgateway.istio-system     /*                     
                                                              *                                     /stats/prometheus*     
InboundPassthroughClusterIpv4                                 *                                     /*                     
inbound|9080||                                                *                                     /*                     
inbound|9080||                                                *                                     /* 


$ istioctl proxy-config cluster productpage-v1-6b746f74dc-k5lwk
SERVICE FQDN                                            PORT      SUBSET          DIRECTION     TYPE             DESTINATION RULE
                                                        9080      -               inbound       ORIGINAL_DST

That confirms the routing through the transparent proxy.
Opening 192.168.99.100:30800 (nodeport) in a browser shows a site like the one below.

f:id:FallenPigeon:20210809214650p:plain