Open to opportunities

Cáp Hữu Tú

DevOps / System Engineer (Student)

Third-year student focused on systems and DevOps, with hands-on experience in networking, distributed systems, and cloud infrastructure. Experienced in building and deploying scalable environments using Docker, Kubernetes, and CI/CD pipelines, with a practical focus on Linux administration, monitoring, and system troubleshooting.

Ho Chi Minh City, Vietnam · +84 345474746

Skills

Hands-on technologies and tooling

Cloud & DevOps
AWS · GCP · Cloudflare · GitHub Actions · Docker · Kubernetes · Terraform (basic) · Ansible · ArgoCD
Systems Administration
Linux (Ubuntu/CentOS) · Windows Server · Virtualization (VMware Workstation, KVM) · DNS · DHCP · Active Directory (AD DS) · Web Servers (Nginx, Apache)
Networking
Routing (Static, OSPF, Basic BGP, Basic MPLS) · VLAN · NAT · ACL · VPN (WireGuard) · Network simulation (GNS3, VMware)
Monitoring & Security
Prometheus · Grafana · Loki · CIS Benchmark · Zero Trust · SSL/TLS Management
Programming
Python · Bash/Shell Scripting · C# · C++
Languages
English (IELTS 6.5) · Vietnamese (Native)
Soft Skills
Problem-solving · Proactive teamwork

Core Projects

6 infrastructure & DevOps projects

Healthcare Infrastructure Setup / DevSecOps

In Progress
Infrastructure Engineer · Team Project · 03/2026 - Present

Designed and deployed a hybrid-cloud infrastructure (AWS + OpenStack private cloud) for a digital healthcare platform, integrating full DevSecOps practices including CI/CD security gates, GitOps, SIEM, and comprehensive observability.

Responsibilities

  • Provisioned K3s Kubernetes cluster on OpenStack (3 nodes) and managed AWS resources (ALB, EC2, S3, CloudFront) using Terraform with daily drift detection
  • Built DevSecOps CI/CD pipelines (GitHub Actions) with Trivy scanning; remediated 47 Trivy-reported vulnerabilities through pipeline-enforced fixes
  • Implemented ArgoCD GitOps with manual sync policy and enforced immutable image tags in production overlays
  • Bridged AWS and OpenStack networks via WireGuard VPN tunnel; exposed internal services through Cloudflare Tunnel without public IPs
  • Deployed kube-prometheus-stack (Prometheus, Grafana, Alertmanager) with hybrid node scraping across K3s and AWS; Loki + Promtail for centralized log aggregation; Telegram alerting for critical and warning events
  • Deployed Wazuh SIEM with 6 agents across all nodes, integrated AWS CloudTrail S3 logs, and wrote custom Telegram alert integration
  • Automated infrastructure auditing with Ansible and dynamic inventory from Terraform outputs; configured double ProxyJump SSH for private AWS nodes

Architecture

Frontend:  User → Cloudflare DNS → CloudFront → S3 (static)
API:       User → ALB → EC2 Backend (HA x2) → WireGuard VPN
              → OpenStack K3s → Database / OCR Service
Internal:  User → Cloudflare Tunnel → K3s Ingress (OpenStack)
              → Grafana / Prometheus / ArgoCD / Wazuh
AWS (ALB, EC2, S3, CloudFront, SSM, CloudTrail) · OpenStack · K3s / Kubernetes · Docker · WireGuard · Cloudflare Tunnel · GitHub Actions · ArgoCD · Terraform · Ansible · Prometheus / Grafana / Loki / Promtail / Alertmanager · Wazuh SIEM · Trivy / SonarQube

CIS GKE Security Audit

Security / System Engineer (Project) · Team Project · 04/2026 - 05/2026

Evaluated and improved the security posture of a GKE cluster based on CIS (Center for Internet Security) Benchmarks.

Responsibilities

  • Designed an auditing framework to evaluate GKE cluster security against CIS Benchmarks, covering 31 security recommendations across control plane, node, and workload configurations
  • Developed and integrated modular audit scripts
  • Validated security configurations to identify misconfigurations
  • Applied remediation steps to harden the cluster environment
GCP · Kubernetes (GKE) · CIS Benchmarks · Shell Scripting

Gout-LLMOps: Medical LLM Evaluation Automation

DevOps & LLMOps Engineer · Team Project · 03/2026 - 05/2026

Developed an end-to-end LLMOps pipeline to automate the evaluation of medical Large Language Models (LLMs), ensuring adherence to clinical standards and response reliability.

Responsibilities

  • Contributed to building a CI/CD/CE (continuous integration, delivery, and evaluation) pipeline using GitHub Actions to automate Docker builds and evaluation triggers
  • Managed GitOps-based continuous deployment with ArgoCD for Kubernetes workloads
  • Integrated an "LLM-as-a-Judge" framework with GPT-4 and RAGAS for quantitative scoring of faithfulness and safety
  • Implemented a monitoring and logging stack (Prometheus, Grafana, Loki) with custom dashboards for real-time metrics
Kubernetes · GitHub Actions · ArgoCD · Docker · GPT-4 / RAGAS · Prometheus / Grafana / Loki · Vector Database (RAG)

RL-based Inference Offloading for Medical LLMs

AI Infrastructure Engineer · Team Project · 03/2026 - 04/2026

Optimized inference delivery for medical LLMs across Edge-Cloud architectures using Reinforcement Learning to dynamically balance latency and computational cost.

Responsibilities

  • Constructed an Edge-Cloud testbed (Edge, Cloud, Gateway) using multi-node Docker environments
  • Deployed medical LLMs (PhoGPT, VinaLLaMA) using Ollama and unified API communication layers
  • Modeled inference offloading as a Markov decision process (MDP) and trained RL agents (PPO/DQN) to optimize offloading decisions
  • Simulated real-world network constraints (latency, packet loss) using Linux tc and netem
Reinforcement Learning (DQN) · Docker / Ollama · Linux tc/netem · Python (API & RL) · Edge-Cloud Computing (Lab Environment)

Distributed Data Processing Infrastructure

System Administrator (Project) · Team Project · 10/2025 - 11/2025

Provisioned and operated a distributed Apache Spark cluster on GCP to support large-scale data processing workloads.

Responsibilities

  • Provisioned cloud infrastructure and networking on GCP
  • Configured a multi-node distributed Spark cluster
  • Monitored system resources and optimized cluster configurations
Impact

Dataset size: 15 GB · Nodes: 3 · Processing time: about 12 minutes

GCP · Apache Spark · Linux

Network Performance Evaluation

Network / System Engineer (Project) · Team Project · 10/2025 - 11/2025

Evaluated and benchmarked network performance across virtualized and containerized environments.

Responsibilities

  • Designed testing topologies using VMware Workstation and Docker/Kubernetes
  • Built and managed virtualized lab environments with VMware Workstation for isolated network testing
  • Wrote a Python script to automatically measure throughput, latency, and packet loss using iperf3 and netem
  • Analyzed system logs and compared performance metrics
VMware Workstation · Docker · Kubernetes · iperf3 / tc / netem

Additional Projects

Broader scope of work and experiments

Real-time LAN Application

09/2024 - 10/2024

Built a real-time LAN communication app using socket programming.

C# · Socket Programming

Course Management Mobile App

09/2025 - 12/2025

Developed a mobile application with backend database integration.

Java · PostgreSQL

Smart Parking IoT System

03/2025 - 05/2025

Prototyped an IoT-based automated parking system.

C++ · Arduino · ESP8266

Education & Lab Experience

Academic background and hands-on lab work

University of Information Technology, VNU-HCM

Computer Networks and Data Communications

GPA 3.35/4.0
Expected 2027

Relevant Coursework

Computer Networks · Distributed Systems · Cloud Computing · Network Security · Operating Systems

Lab & Self-taught Experience

  • AWS Academy Cloud Foundations
  • Built and managed virtual lab environments using VMware Workstation and KVM
  • Hands-on with Ansible for automated provisioning and configuration management
  • Configured enterprise network services (DNS, DHCP, AD DS)
  • Implemented firewall rules, ACLs, and VPN tunnels
  • Managed OpenStack resources and virtualized networking
  • Deployed and scaled Kubernetes clusters with ArgoCD-based GitOps workflows
  • Set up monitoring and logging pipelines using Prometheus, Grafana, and Loki
  • Explored distributed architectures (Kafka, DFS)

Achievements

Competitions and recognition

1st Place in Cluster - NET Challenge 2025 (11/2025)

Honorable Mention - NET Challenge 2026 (06/2026). NET Challenge is an academic competition on Computer Networks, System Administration (Windows/Linux), and Cloud Computing, organized by the Faculty of Computer Networks and Communications, UIT, VNU-HCM.

Contact

Let's connect

Built with Next.js · Deployed on Cloudflare Pages · portfolio.htsnov.com