Content provided by OCDevel. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by OCDevel or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.
MLA 021 Databricks

26:00
 

Try a walking desk to stay healthy while you study or work!

Full notes at ocdevel.com/mlg/mla-21

  • Raybeam and Databricks: Ming Chang of Raybeam describes the company's focus on data science and analytics, and how its recent acquisition by Dept Agency expanded its scope into MLOps and AI. Raybeam often uses Databricks because of its comprehensive feature set.

  • Understanding Databricks: Contrary to initial assumptions, Databricks is not just an analytics platform like Tableau but an MLOps platform that competes with tools like SageMaker and Kubeflow. It lets you create notebooks, run Python code, and use a hosted Spark cluster, with Delta Lake for data storage.

  • Choosing the Right MLOps Tool: Raybeam recommends different tools depending on client requirements. Decision factors include the client's existing expertise, infrastructure needs, and scaling challenges. Databricks is often recommended for its ease of use and features.

  • Databricks Features: Offers hosted Spark clusters on AWS, Azure, or GCP; integrates with IDEs like VSCode through Databricks Connect; provides its own Git integration for version control of notebooks; and uses Delta Lake to version Parquet files, which makes operations like edit and delete practical.

  • Parquet and Delta Lake: Parquet is a columnar file format optimized for big data, and Delta Lake provides transaction-like operations over Parquet files by maintaining a version history.

  • Pricing and Usage: Databricks charges a nominal fee on top of the cloud provider's charges. It is accessible to solo developers and startups as well as larger teams.

  • Ming Chang's Picks: Ming discusses his interest in automated stock trading projects and in building drones with a Raspberry Pi, highlighting the intersection of programming and physical computing.
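
The notes above say Delta Lake gives transaction-like edits and deletes over Parquet by keeping a version history. As a rough illustration of that idea only, here is a toy model in plain Python: data files are never modified in place, and an append-only log records which files each commit adds or removes. Everything here (`ToyVersionedTable`, `delete_where`, the `part-…` file ids) is a hypothetical name invented for this sketch. It is not Delta Lake's API and leaves out almost everything a real implementation handles.

```python
# Toy model of version history over immutable data files.
# NOT Delta Lake's API: every name below is hypothetical and for illustration.
from dataclasses import dataclass, field


@dataclass
class ToyVersionedTable:
    # Immutable "files": file id -> rows (stand-ins for Parquet files).
    files: dict = field(default_factory=dict)
    # Append-only log: one entry per commit, listing files added and removed.
    log: list = field(default_factory=list)

    def _commit(self, new_files=(), remove=()):
        added = []
        for rows in new_files:
            file_id = f"part-{len(self.files):05d}"
            self.files[file_id] = list(rows)
            added.append(file_id)
        self.log.append({"add": added, "remove": list(remove)})
        return len(self.log) - 1  # version number of this commit

    def latest_version(self):
        return len(self.log) - 1

    def _live_files(self, version):
        # Replay the log up to and including `version`.
        live = []
        for entry in self.log[: version + 1]:
            live = [f for f in live if f not in entry["remove"]]
            live.extend(entry["add"])
        return live

    def append(self, rows):
        return self._commit(new_files=[rows])

    def delete_where(self, predicate):
        # Rewrite each affected file without the matching rows; the old file
        # stays on disk so earlier versions remain readable.
        removed, rewritten = [], []
        for file_id in self._live_files(self.latest_version()):
            rows = self.files[file_id]
            kept = [r for r in rows if not predicate(r)]
            if len(kept) != len(rows):
                removed.append(file_id)
                if kept:
                    rewritten.append(kept)
        return self._commit(new_files=rewritten, remove=removed)

    def read(self, version=None):
        if version is None:
            version = self.latest_version()
        return [r for f in self._live_files(version) for r in self.files[f]]


table = ToyVersionedTable()
v0 = table.append([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
v1 = table.append([{"id": 3, "name": "c"}])
v2 = table.delete_where(lambda row: row["id"] == 2)

current_ids = sorted(row["id"] for row in table.read())             # [1, 3]
original_ids = sorted(row["id"] for row in table.read(version=v0))  # [1, 2]
print(current_ids, original_ids)
```

The delete is recorded as "remove one file, add its rewritten replacement" in a single log entry, so reading an older version is just a matter of replaying fewer log entries. On a real cluster the comparable read would go through Spark's Delta reader, for example `spark.read.format("delta").option("versionAsOf", 0).load(path)`, where `path` is wherever your table lives.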

Additional Resources

For a hands-on look at Ming Chang's drone project, follow his developments or connect for insights on building a Raspberry Pi-powered drone.

Machine Learning Guide
