overview

In the third post we completed a sample end-to-end analytics workflow using Python, MLflow, Databricks, and a few basic AWS tools. However, because Databricks is Spark based, that stack carried limitations for a plain-Python workflow, which this post addresses with a more Python-native set of tools.

recalling part 3 limitations

part 4 - preferred stack

  • Python + Dask - language of choice; Dask adds parallel/distributed compute (see the sketch after this list)
  • GitHub - version control
  • GitHub Actions - CI
  • S3 - distributed storage
  • EC2 - distributed compute
  • Flask - Python API framework
  • MLflow - ML lifecycle management
  • Docker - containerization
  • Power BI - visualization
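
As a rough illustration of how the Python + Dask and S3 pieces fit together, here is a minimal sketch that reads a set of CSV files from S3 into a Dask DataFrame and runs a simple aggregation in parallel. The bucket, prefix, and column names are placeholders, and the cluster setup and `storage_options` will depend on your own AWS configuration; this is not the exact pipeline from the series, just an assumed shape of it.

```python
# minimal sketch: parallel aggregation over CSVs stored in S3 with Dask
# (bucket, prefix, and column names are hypothetical placeholders)
import dask.dataframe as dd
from dask.distributed import Client

if __name__ == "__main__":
    # local cluster for development; swap for a cluster running on EC2 later
    client = Client(n_workers=4, threads_per_worker=2)

    # lazily read every CSV under the prefix; requires the s3fs package
    df = dd.read_csv(
        "s3://example-bucket/raw-data/*.csv",
        storage_options={"anon": False},  # uses your configured AWS credentials
    )

    # simple groupby aggregation, executed in parallel across workers
    result = df.groupby("category")["value"].mean().compute()
    print(result)

    client.close()
```

The same code runs unchanged on a laptop or on a distributed cluster; only the `Client` construction changes, which is a large part of the appeal of Dask over a Spark-based platform here.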

general flow

rationale for changes

  • automated model integration (see the sketch below)
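
As a hedged sketch of what automated model integration could look like with the Flask and MLflow pieces above, the snippet below loads a registered model from MLflow and exposes it behind a small prediction endpoint. The model name, stage, and JSON payload schema are assumptions rather than anything defined in this series, and the tracking/registry URI is taken from your MLflow configuration.

```python
# minimal sketch: serving an MLflow-registered model from a Flask endpoint
# (model name/stage and the expected JSON payload are hypothetical)
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# load the Production version of a registered model by registry URI;
# assumes MLFLOW_TRACKING_URI points at your tracking server/registry
model = mlflow.pyfunc.load_model("models:/example-model/Production")

@app.route("/predict", methods=["POST"])
def predict():
    # expects a JSON list of records, e.g. [{"feature_a": 1.0, "feature_b": 2.0}]
    records = request.get_json()
    frame = pd.DataFrame(records)
    # assumes the model returns a 1-D array/Series of predictions
    predictions = model.predict(frame)
    return jsonify(predictions.tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Because the model is pulled from the registry by name and stage rather than from a hard-coded artifact path, promoting a new version in MLflow is enough for the API to pick it up on the next restart, which is the sense in which the integration is automated.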

known limitations

notes