Cloud-Native Satellite Ground System

PUBLISHED ON JAN 1, 2019 — INDUSTRY, THE AEROSPACE CORPORATION

Purpose

At The Aerospace Corporation, I helped build a prototype cloud platform that used only open-source, cloud-native technologies. The prototype was designed as a pathfinder for migrating a legacy, vendor-locked satellite ground system to a vendor-agnostic cloud platform. Completing the project gave the customer two key insights. First, one can build a complex, multilayered platform for data-intensive space applications without proprietary software. Second, one can use a number of cutting-edge cloud-native technologies to fundamentally enhance a legacy platform. Of course, this requires far more insight than a simple drag-and-drop onto the cloud.

Layers

From a 10,000-foot view, the following layers make up the architecture:

Open AF baby

Our team built an on-premises cloud running virtual machines provisioned with OpenStack. Kubernetes (K8s) was deployed within the VMs as the underlying platform. Hadoop was the data foundation on which HBase ran. Apache Drill was the query engine for all underlying data. Applications were deployed as Docker containers (running in pods) within K8s after passing through a DevSecOps CI/CD pipeline built on Jenkins. Finally, the stack was monitored and observed using Prometheus, Grafana, and Istio.

The Data Layer

I was first tasked with building out the data layer of the architecture. This involved the following:

  • Containerize/deploy a Hadoop cluster on K8s
  • Containerize/deploy an HBase cluster on K8s
  • Containerize/deploy a Drill cluster on K8s

A Data Fiasco

Once that was completed, I cleaned a 1.2 TB satellite telemetry data set and ingested it into HBase. The data set had on the order of 10^9 rows and 10^4 columns. This required a strong understanding of efficient, column-oriented table design in HBase. Since HBase stores the columns of a column family together, I tracked down the telemetry documentation, parsed it, and created families accordingly. I then created a parallelized data ingester and deployed it as a worker pool in K8s.
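
The family design and ingest path described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the mnemonics, family names, row-key layout, and the `write_batch` callable are all hypothetical. In a real deployment `write_batch` might wrap a happybase batch write; here it is injected so the sketch runs without an HBase cluster, and a thread pool stands in for the K8s worker pool.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

# Hypothetical mapping from telemetry mnemonic to column family, standing in
# for what would be parsed out of the telemetry documentation.
MNEMONIC_TO_FAMILY = {
    "bus_voltage": "power",
    "battery_temp": "power",
    "wheel_speed": "attitude",
    "star_tracker_q0": "attitude",
}


def to_put(record):
    """Turn one telemetry record into a (row_key, columns) pair.

    Columns are keyed as b"family:qualifier" with byte values, which is the
    shape happybase uses for puts.
    """
    # Assumed row key: spacecraft id plus zero-padded timestamp, so rows for
    # one spacecraft sort chronologically.
    row_key = f"{record['sc_id']}#{record['timestamp']:012d}".encode()
    columns = {}
    for name, value in record["values"].items():
        family = MNEMONIC_TO_FAMILY[name]
        columns[f"{family}:{name}".encode()] = str(value).encode()
    return row_key, columns


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def ingest(records, write_batch, batch_size=1000, workers=4):
    """Convert records to puts and hand batches to a pool of workers.

    `write_batch` is any callable taking a list of (row_key, columns) pairs.
    Returns the number of rows handed to `write_batch`.
    """
    def write_and_count(batch):
        write_batch(batch)
        return len(batch)

    batches = chunked((to_put(r) for r in records), batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: Executor.map submits every batch up front, so a data set of
        # real size would need bounded submission instead.
        return sum(pool.map(write_and_count, batches))
```

Grouping mnemonics that are queried together into one family keeps them physically close on disk, which is the locality property the table design relies on. Passing something like `store.extend` for an in-memory list `store` as `write_batch` is enough to exercise the sketch locally.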

Demo Days

Finally, I designed a set of SQL-like queries in Drill to efficiently analyze (e.g., avg, std) and retrieve the telemetry data from HBase. To make the milestone demonstrations given to the customer more presentable, I created an interactive, web-based visualization tool that issues Drill queries and displays the results to the user.
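
A query of the kind described above might be built as in the sketch below. It is illustrative only: the table, family, and qualifier names are hypothetical, `hbase` is assumed to be the name of the Drill storage plugin, and values are assumed to be stored as UTF-8 strings (hence the `CONVERT_FROM` and `CAST`). The function only builds the SQL string; a client such as pydrill would submit it to Drill.

```python
def stats_query(table, family, qualifier, start_key, end_key):
    """Build a Drill SQL string computing avg/stddev of one HBase column.

    The row-key range restricts the scan to a slice of the table.
    """
    # Identifiers are interpolated into the SQL text, so accept only
    # alphanumerics and underscores.
    for ident in (table, family, qualifier):
        if not ident.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {ident!r}")
    # Row keys go inside single-quoted literals; reject embedded quotes.
    for key in (start_key, end_key):
        if "'" in key:
            raise ValueError(f"unsafe row key: {key!r}")

    # HBase cells are bytes: decode to text, then cast to a number.
    value = f"CAST(CONVERT_FROM(t.`{family}`.`{qualifier}`, 'UTF8') AS DOUBLE)"
    return (
        f"SELECT AVG({value}) AS avg_value, STDDEV({value}) AS std_value "
        f"FROM hbase.`{table}` t "
        f"WHERE CONVERT_FROM(t.row_key, 'UTF8') "
        f"BETWEEN '{start_key}' AND '{end_key}'"
    )
```

For example, `stats_query("telemetry", "power", "bus_voltage", "SAT1#000000000000", "SAT1#000000000999")` yields a single statement that aggregates one column over one spacecraft's key range, so only one family needs to be read.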

Technologies

  • HBase
  • Hadoop
  • Drill
  • Docker
  • Kubernetes: PVs, PVCs, Services, DaemonSets, etc.
  • Python: bokeh, happybase, pydrill