Explorer: Data Frames in Elixir with Chris Grainger
In this episode of Elixir Wizards, Charles Suggs sits down with Chris Grainger, co-founder and CTO of Amplified and creator of the Explorer library. Chris explains how Explorer brings the familiar data-frame workflows of R’s dplyr and Python’s pandas into the Elixir world. We explore (pun intended!) how Explorer integrates with Ecto, Nx, and LiveView to build end-to-end data pipelines without leaving the BEAM, and how features like lazy evaluation and distributed frames let you tackle large datasets.
Whether you’re generating reports or driving interactive charts in LiveView, Explorer makes tabular data accessible to every Elixir developer. We wrap up by looking ahead to SQL-style backends, ADBC connectivity, and other features on the Explorer roadmap.
Key topics discussed in this episode:
- dplyr- and pandas-inspired data manipulation in Elixir
- Polars integration via Rust NIFs for blazing performance
- Immutable data frames and BEAM-friendly concurrency
- Lazy evaluation to work with arbitrarily large tables
- Distributed data-frame support for multi-node processing
- Seamless integration with Ecto schemas and queries
- Zero-copy interoperability between Explorer and Nx tensors
- Apache Arrow and ADBC protocols for cross-language I/O
- Exploring SQL-style backends for remote query execution
- Building interactive dashboards and charts in LiveView
- Consolidating ETL workflows into a single Elixir API
- Streaming data pipelines for memory-efficient processing
- Tidy data principles and behavior-based API design
- Real-world use cases: report generation, patent analysis, and more
- Future roadmap: new backends, query optimizations, and community plugins
Links mentioned:
https://hexdocs.pm/explorer/Explorer.html
https://www.amplified.ai/
https://www.r-project.org/
https://vita.had.co.nz/papers/tidy-data.pdf
https://www.tidyverse.org/
https://www.python.org/
https://dplyr.tidyverse.org/
https://go.dev/
https://hexdocs.pm/nx/Nx.html
https://github.com/pola-rs/polars
https://github.com/rusterlium/rustler
https://www.rust-lang.org/
https://www.postgresql.org/
https://hexdocs.pm/ecto/Ecto.html
https://www.elastic.co/elasticsearch
https://arrow.apache.org/
Chris Grainger & Chris McCord Keynote ElixirConf 2024: https://youtu.be/4qoHPh0obv0
https://dbplyr.tidyverse.org/
https://spark.posit.co/
https://hexdocs.pm/pythonx/Pythonx.html
https://hexdocs.pm/vega_lite/VegaLite.html
10 Minutes to Explorer: https://hexdocs.pm/explorer/exploring_explorer.html
https://github.com/elixir-nx/scholar
https://scikit-learn.org/stable/
https://github.com/cigrainger
https://erlef.org/slack-invite/erlef
https://bsky.app/profile/cigrainger.bsky.social