3 DAYS • APRIL 22-24
WHERE DATA MEETS INTELLIGENCE
Experience 3 DAYS of no bullsh*t technical talks & awesome networking with the brightest minds in data & AI in Oakland, CA.

Every AI breakthrough starts with data. We’re the premier technical event spotlighting cutting-edge AI and the data stack that powers it.















JOIN YOUR TRIBE
Our attendees are AI engineers, founders, CTOs, AI researchers, Heads of Data, and investors who are all building the future of data.
Days
Technical Attendees
Deep-Dive Talks
Featured Keynotes

Naveen Rao
VP of AI
Databricks

Denis Yarats
Co-Founder & CTO
Perplexity

Aaron Katz
Co-Founder & CEO
ClickHouse

Martin Casado
General Partner
a16z

Sharon Zhou
Founder & CEO
Lamini

Michele Catasta
President
Replit

Jake Brill
Head of Product - Integrity
OpenAI
Rachad Alao
Senior Engineering Director
Meta

Julien Le Dem
Principal Engineer
Datadog

Joseph Gonzalez
Professor
RunLLM & UC Berkeley

Krishnaram Kenthapadi
Chief Scientist, Clinical AI
Oracle Health

George Mathew
Managing Director
Insight Partners
Daniel Olmedilla
Distinguished Engineer, AI & Trust

Sumti Jairath
Chief Architect
SambaNova Systems
Featured Keynotes

- Keynotes
Join us for an authentic fireside chat that cuts through industry hype to explore how real-time data infrastructure is transforming analytics and AI. We'll examine technical hurdles in processing data at scale in real-time and the architectural decisions enabling performance breakthroughs. The discussion covers data processing evolution, performance bottlenecks in distributed systems, observability innovations, and approaches for maintaining consistency while increasing throughput. Gain insights
...
- Keynotes
Julien Le Dem is a Principal Engineer at Datadog, serves as an officer of the ASF and is a member of the LFAI&Data Technical Advisory Council. He co-created the Parquet, Arrow and OpenLineage open source projects and is involved in several others. His career leadership began in Data Platforms at Yahoo! - where he received his Hadoop initiation - then continued at Twitter, Dremio and WeWork. He then co-founded Datakin (acquired by Astronomer) to solve Data Observability. His French accent makes his talks particularly attractive.

- Keynotes
Krishnaram Kenthapadi is the Chief Scientist, Clinical AI at Oracle Health, where he leads the AI initiatives for Clinical Digital Assistant and other Oracle Health products. Previously, as the Chief AI Officer & Chief Scientist of Fiddler AI, he led initiatives on generative AI (e.g., Fiddler Auditor, an open-source library for evaluating & red-teaming LLMs before deployment; AI safety, observability & feedback mechanisms for LLMs in production), and on AI safety, alignment, observability, and trustworthiness, as well as the technical strategy, innovation, and thought leadership for Fiddler. Prior to that, he was a Principal Scientist at Amazon AWS AI, where he led the fairness, explainability, privacy, and model understanding initiatives in the Amazon AI platform, and shaped new initiatives such as Amazon SageMaker Clarify from inception to launch. Prior to joining Amazon, he led similar efforts at the LinkedIn AI team, and served as LinkedIn’s representative in Microsoft’s AI and Ethics in Engineering and Research (AETHER) Advisory Board. Previously, he was a Researcher at Microsoft Research Silicon Valley Lab. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006. He serves regularly on the senior program committees of FAccT, KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. His work has been recognized through awards at NAACL, WWW, SODA, CIKM, ICML AutoML workshop, and Microsoft’s AI/ML conference (MLADS). He has published 60+ papers, with 7000+ citations and filed 150+ patents (72 granted). He has presented tutorials on trustworthy generative AI, privacy, fairness, explainable AI, model monitoring, and responsible AI at forums such as ICML, KDD, WSDM, WWW, FAccT, and AAAI, given several invited industry talks, and instructed a course on responsible AI at Stanford.

- Keynotes
As enterprises increasingly leverage vast public and private datasets, generative AI and agentic systems are transforming the landscape of AI-driven solutions. These systems demand unparalleled scalability, speed, and efficiency to process massive data volumes while autonomously orchestrating complex workflows. SambaNova Systems offers its revolutionary memory-centric design, engineered to power trillion-parameter models and multi-agent systems with record-breaking interactive inference performa
...
100+ Speakers
Learn from data & AI heroes at top companies as they explain their architectures, discoveries and solutions in detail.

- Analytics & BI
Lloyd Tabb is a tech pioneer who has shaped how people use the internet and data for over 30 years. After working as Borland's database architect, he was Principal Engineer at Netscape during the browser wars and helped found Mozilla.org. He later founded Looker, acquired by Google in 2019, which helped define the Modern Data Stack. Now at Meta, he leads Malloy, an experimental language that reimagines SQL with coding libraries and was recently transferred to the Linux Foundation. The project has expanded to support Presto, Trino, Snowflake, and other SQL dialects, while adding features like parameterized sources and visual query builders.

- Data Eng & Infrastructure
Ryan Blue is the original creator of Apache Iceberg and currently serves as Vice President of the project. With over a decade in data engineering, he's an established expert in big data formats and infrastructure. Currently a Member of Technical Staff at Databricks since June 2024, Ryan previously co-founded and served as CEO of Tabular until its acquisition by Databricks. His career includes senior positions at Netflix and Cloudera, where he was a technical lead for data formats. An Apache Software Foundation member since 2017, he's a committer in the Apache Parquet, Avro, and Spark communities and previously served as VP of Apache Avro. Ryan holds dual Bachelor's degrees in Mathematics and Computer Science from the University of Idaho and a Master's in Computer Science from the University of Maryland.

- Foundation Models
Ethan Rosenthal is a Member of Technical Staff at Runway, an applied AI research company focused on multimedia content creation, where he builds engineering systems to accelerate the work of research scientists. His career spans diverse roles across AI, machine learning, and data science - from training language models at Square to developing recommendation systems at seed-stage ecommerce startups. Before working in tech, Ethan was an actual scientist and got his PhD in experimental physics from Columbia University.

- Analytics & BI
Mike has spent over two decades as a technologist, entrepreneur, and investor. He’s currently co-founder and CEO of Rill, a cloud service for operational intelligence. Previously he founded Metamarkets (acquired by Snap, Inc. in 2017), a real-time analytics platform for digital ad firms, and CustomInk.com, a leader in custom apparel online. Mike was also a founding partner at the venture capital firm DCVC, which has invested over $2B in assets in deep tech. He began his career as a software engineer for the Human Genome Project and later received a Ph.D. in computational biology.

- Foundation Models
Han-chung Lee is a machine learning expert who builds and operates AI systems with a focus on GenAI, LLM agents, and recommendation engines. Currently Director of Machine Learning at Moody's Analytics and founder of Calabazas Creek, Han-chung excels at untangling complex code and organizations. A Berkeley EECS grad with an MBA from SJSU, he shares insights on ML engineering and tech investing drawn from his diverse experience across sell-side, buy-side, and venture capital. Follow his practical wisdom and occasional industry reflections.

- Analytics & BI
Julian Hyde is the original developer of Apache Calcite, which provides SQL parsers and query optimizers for dozens of products, and Morel, a new functional query language. Previously he led the query processing team at Looker (acquired by Google in 2020), and co-founded SQLstream, an engine for continuous queries. He left Google in early 2025 to create the next language for data.

- Lightning Talks
Arvind Prabhakar is the co-founder and CEO of Tabsdata, a company pioneering a new approach to self-serve data integration through Pub/Sub for Tables. Prior to this, he was the co-founder and Chief Product Officer of StreamSets, a leading data integration platform now part of IBM. An open source enthusiast and seasoned entrepreneur, Arvind has spent his career building systems that simplify and scale enterprise data management. If you have questions or thoughts on the future of data integration, feel free to reach out to him directly at arvind@tabsdata.com.

- Workshops
Modern data lakes promise affordability and scalability, but using them can be a headache. Cloud data warehouses make querying easy, but they come with a hefty price tag and extra complexity. What if you could get the same ease of use without the cost and lock-in?
In this session, we’ll show you how to leverage open-source software to build a fully functional, queryable analytics powerhouse using DuckDB, Fivetran, and Polaris Catalog. We’ll walk through how to:
1. Load data that is automatically
...
George Fraser is the CEO and co-founder of Fivetran. George founded the company with COO Taylor Brown in 2012 after completing the prestigious Y Combinator accelerator program. Since then he has built Fivetran, a fully managed automated data integration provider, from an idea to a rapidly growing global business valued at $5.6 billion, supported by a global team of 1,000+ employees.

- Workshops
Ori Soen is a serial entrepreneur and current Founder/CEO of Montara Inc., creating the first unified DataOps platform for data development. With a track record of building, scaling, and successfully exiting multiple startups, Ori previously served as EVP at Medallia (NYSE: MDLA), where he helped take the company public. At Medallia, he led the Digital Products Business Unit, mid-market sales, served as CMO, and joined through the acquisition of Kampyle, where as CEO he built the company into a market leader. Earlier, he was CMO and Head of Product at Jajah (acquired by Telefonica).

- Workshops
Modern data workloads demand fast, interactive, and scalable visualization—without the cost and complexity of server-side rendering. The local-first approach leverages modern browser capabilities, WebAssembly, and in-browser computation to achieve high-performance analytics while reducing cloud costs.
In this workshop, we’ll explore:
1. Why Local-First? The benefits of running everything client-side for cost-efficient, scalable visualization across thousands of users.
2. WebAssembly (WASM) for Data
...
- Lightning Talks
Not all AI agent use cases are created equal. While code generation agents can be tested against clear benchmarks, operational agents tackling real-world problems face a fundamentally different challenge: how do you evaluate an agent that must navigate complex, dynamic systems without a predefined playbook? Take root cause analysis in distributed systems: an agent must understand intricate service dependencies, parse through inconsistent logs, and reason about potential failure modes. Unlike cod
...
- GenAI Applications
Mitul Tiwari is CTO and Co-founder of a stealth AI startup. Until recently he was a Director of AI and Machine Learning Engineering at ServiceNow, leading the natural language technologies group. Earlier he was CTO and Co-founder of Passage AI (acquired by ServiceNow). His expertise lies in building data-driven products using AI, machine learning, and big data technologies. Previously he was head of People You May Know and Growth Relevance at LinkedIn, where he led technical innovations in large-scale social recommender systems. Prior to that, he worked at Kosmix (now Walmart Labs) on web-scale text categorization and its applications. He earned his PhD in Computer Science from the University of Texas at Austin and his undergraduate degree from the Indian Institute of Technology, Bombay. He has also co-authored more than twenty publications in top conferences such as ACL, AAAI, KDD, WWW, RecSys, VLDB, SIGIR, CIKM, and SPAA.

- Data Sci & Algos
Ciro Greco is the Founder and CEO of bauplan, a zero-copy, data-first FaaS platform launched in 2023. He holds a Ph.D. in Experimental Psychology, Linguistics and Neuroscience from the University of Milano-Bicocca. Previously, he co-founded Tooso, an AI-powered commerce search company acquired by Coveo in 2019, where he later served as VP of Artificial Intelligence. His expertise spans AI, linguistics, and data engineering, with a focus on making cloud data pipelines more efficient.

- ML OPs & Platforms
Marcel Kornacker is CTO and Co-Founder at Pixeltable and is best known as the founder of Apache Impala and co-founder of Apache Parquet. He holds a Ph.D. in Computer Science from UC Berkeley, where he studied databases under Joe Hellerstein. His career includes founding several startups, including Blink Computing, which provided data lake analytics as a service. He has also served as an Entrepreneur in Residence at both Sutter Hill Ventures and Coatue Management. Kornacker's expertise spans databases, data analytics, and open-source technologies.

- Lightning Talks
Michael is a globally recognized leader in AI and data-driven business transformation, known for developing AIOS, an intelligent marketing system at Plus that optimizes customer journey analysis and marketing ROI. A former NYU Stern marketing professor, his research spans predictive computation, marketing strategies, and consumer behavior. He has launched successful AI products focused on user safety, privacy, and marketing optimization. His work on explainable prediction and data synthesis, particularly with incomplete data, has been featured in major publications like LA Times and AdAge, and he frequently speaks at prestigious events including The Nantucket Project and ANA Masters of Marketing.

- AI & Data Culture
At the foundation of AI project failures lies a critical gap between data teams and business reality. On top of this gap, data quality issues, unexpected privacy concerns, and tools that don't align with actual business problems arise to hinder or block implementation. As we've built our own AI product—AI Decisioning—and implemented it with customers, we've learned that successful AI implementations depend on embedding data teams within business units. Embedding doesn't mean breaking apart your
...
- Lightning Talks
Building Cost-Effective LLM Routers: Boost Accuracy 25% While Cutting Costs 90% | This session reveals how to build intelligent model routers that dynamically direct inputs to the optimal large language model (LLM) for each specific task. Attendees will learn practical implementation strategies for multi-model LLM systems that significantly improve performance metrics—achieving up to 25% higher accuracy while reducing operational costs by as much as 90%. The presentation covers essential routing
...
- Lightning Talks
The first two years of the GenAI revolution are bending the OSS way: Open Source models have reached state of the art, and most of the ecosystem around AI is open-source. The key to AI adoption is properly organizing and using business knowledge. In industry, LLMs give way to Small Specialized Models (SSMs), utilized by Domain Expert Agents (DXAs). Their work should be structured according to the domain requirements, requiring structured output. Organizing and using domain knowledge for AI has l
...
- Data Eng & Infrastructure
Building High-Throughput Data Orchestration: Instacart's Journey to 20M Daily Workflows | Explore how Instacart built an enterprise-grade orchestration system handling 20 million daily workflows across diverse technical domains. Learn implementation details of their cloud-native platform combining Apache Airflow and Temporal for robust scheduling and execution. Deep dive into YAML-based workflow definitions, GitOps deployment patterns, and observability solutions that enable reliable scaling. Pr
...
100+ Speakers
Learn from data & AI heroes at top companies as they explain their architectures, discoveries and solutions in detail.

- Analytics & BI
Lloyd Tabb, a tech pioneer, revolutionized internet and data usage over 30 years. After working as Borland's database architect, he was Principal Engineer at Netscape during the browser wars and helped found Mozilla.org. He later founded Looker, acquired by Google in 2019, which helped define the Modern Data Stack. Now at Meta, he leads Malloy, an experimental language that reimagines SQL with coding libraries, recently transferred to the Linux Foundation. The project has expanded to support Presto, Trino, Snowflake, and other SQL dialects, while adding features like parameterized sources and visual query builders.

- Data Eng & Infrastructure

- Foundation Models

- Keynotes

- Foundation Models

- Keynotes

- Data Sci & Algos

- Keynotes
Join us for an authentic fireside chat that cuts through industry hype to explore how real-time data infrastructure is transforming analytics and AI. We'll examine technical hurdles in processing data at scale in real-time and the architectural decisions enabling performance breakthroughs. The discussion covers data processing evolution, performance bottlenecks in distributed systems, observability innovations, and approaches for maintaining consistency while increasing throughput. Gain insights
...Join us for an authentic fireside chat that cuts through industry hype to explore how real-time data infrastructure is transforming analytics and AI. We'll examine technical hurdles in processing data at scale in real-time and the architectural decisions enabling performance breakthroughs. The discussion covers data processing evolution, performance bottlenecks in distributed systems, observability innovations, and approaches for maintaining consistency while increasing throughput. Gain insights
...
- Workshops

- Data Eng & Infrastructure
Ryan Blue is the original creator of Apache Iceberg and currently serves as Vice President of the project. With over a decade in data engineering, he's an established expert in big data formats and infrastructure. Currently a Member of Technical Staff at Databricks since June 2024, Ryan previously co-founded and served as CEO of Tabular until its acquisition by Databricks. His career includes senior positions at Netflix and Cloudera, where he was a technical lead for data formats. An Apache Software Foundation member since 2017, he's a committer in the Apache Parquet, Avro, and Spark communities and previously served as VP of Apache Avro. Ryan holds dual Bachelor's degrees in Mathematics and Computer Science from the University of Idaho and a Master's in Computer Science from the University of Maryland.

- Analytics & BI

- Keynotes
Join us for an authentic fireside chat that cuts through industry hype to explore how real-time data infrastructure is transforming analytics and AI. We'll examine technical hurdles in processing data at scale in real-time and the architectural decisions enabling performance breakthroughs. The discussion covers data processing evolution, performance bottlenecks in distributed systems, observability innovations, and approaches for maintaining consistency while increasing throughput. Gain insights
- Databases

- Foundation Models
Ethan Rosenthal is a Member of Technical Staff at Runway, an applied AI research company focused on multimedia content creation, where he builds engineering systems to accelerate the work of research scientists. His career spans diverse roles across AI, machine learning, and data science - from training language models at Square to developing recommendation systems at seed-stage ecommerce startups. Before working in tech, Ethan was an actual scientist and got his PhD in experimental physics from Columbia University.

- AI Engineering

- AI Engineering

- AI Engineering

- AI Engineering

- Keynotes

- Analytics & BI
Mike has spent over two decades as a technologist, entrepreneur, and investor. He’s currently co-founder and CEO of Rill, a cloud service for operational intelligence. Previously he founded Metamarkets (acquired by Snap, Inc. in 2017), a real-time analytics platform for digital ad firms, and CustomInk.com, a leader in custom apparel online. Mike was also a founding partner at the venture capital firm DCVC, which has invested over $2B in assets in deep tech. He began his career as a software engineer for the Human Genome Project and later received a Ph.D. in computational biology.

- Data Sci & Algos

- Data Eng & Infrastructure

- Keynotes

- Keynotes
- Keynotes

- Keynotes
Julien Le Dem is a Principal Engineer at Datadog, serves as an officer of the ASF and is a member of the LFAI&Data Technical Advisory Council. He co-created the Parquet, Arrow and OpenLineage open source projects and is involved in several others. His career leadership began in Data Platforms at Yahoo! - where he received his Hadoop initiation - then continued at Twitter, Dremio and WeWork. He then co-founded Datakin (acquired by Astronomer) to solve Data Observability. His French accent makes his talks particularly attractive.

- GenAI Applications
- Keynotes

- Keynotes
Krishnaram Kenthapadi is the Chief Scientist, Clinical AI at Oracle Health, where he leads the AI initiatives for Clinical Digital Assistant and other Oracle Health products. Previously, as the Chief AI Officer & Chief Scientist of Fiddler AI, he led initiatives on generative AI (e.g., Fiddler Auditor, an open-source library for evaluating & red-teaming LLMs before deployment; AI safety, observability & feedback mechanisms for LLMs in production), and on AI safety, alignment, observability, and trustworthiness, as well as the technical strategy, innovation, and thought leadership for Fiddler. Prior to that, he was a Principal Scientist at Amazon AWS AI, where he led the fairness, explainability, privacy, and model understanding initiatives in the Amazon AI platform, and shaped new initiatives such as Amazon SageMaker Clarify from inception to launch. Prior to joining Amazon, he led similar efforts at the LinkedIn AI team, and served as LinkedIn’s representative in Microsoft’s AI and Ethics in Engineering and Research (AETHER) Advisory Board. Previously, he was a Researcher at Microsoft Research Silicon Valley Lab. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006. He serves regularly on the senior program committees of FAccT, KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. His work has been recognized through awards at NAACL, WWW, SODA, CIKM, ICML AutoML workshop, and Microsoft’s AI/ML conference (MLADS). He has published 60+ papers, with 7000+ citations and filed 150+ patents (72 granted). He has presented tutorials on trustworthy generative AI, privacy, fairness, explainable AI, model monitoring, and responsible AI at forums such as ICML, KDD, WSDM, WWW, FAccT, and AAAI, given several invited industry talks, and instructed a course on responsible AI at Stanford.

- Keynotes
- Keynotes

- Keynotes
As enterprises increasingly leverage vast public and private datasets, generative AI and agentic systems are transforming the landscape of AI-driven solutions. These systems demand unparalleled scalability, speed, and efficiency to process massive data volumes while autonomously orchestrating complex workflows. SambaNova Systems offers its revolutionary memory-centric design, engineered to power trillion-parameter models and multi-agent systems with record-breaking interactive inference performa
- Foundation Models
Han-chung Lee is a machine learning expert who builds and operates AI systems with a focus on GenAI, LLM agents, and recommendation engines. Currently Director of Machine Learning at Moody's Analytics and founder of Calabazas Creek, Han-chung excels at untangling complex code and organizations. A Berkeley EECS grad with an MBA from SJSU, he shares insights on ML engineering and tech investing drawn from his diverse experience across sell-side, buy-side, and venture capital. Follow his practical wisdom and occasional industry reflections.

- Analytics & BI
Julian Hyde is the original developer of Apache Calcite, which provides SQL parsers and query optimizers for dozens of products, and Morel, a new functional query language. Previously he led the query processing team at Looker (acquired by Google in 2020), and co-founded SQLstream, an engine for continuous queries. He left Google in early 2025 to create the next language for data.

- Lightning Talks

- Workshops

- Foundation Models

- Lightning Talks

- Lightning Talks
Arvind Prabhakar is the co-founder and CEO of Tabsdata, a company pioneering a new approach to self-serve data integration through Pub/Sub for Tables. Prior to this, he was the co-founder and Chief Product Officer of StreamSets, a leading data integration platform now part of IBM. An open source enthusiast and seasoned entrepreneur, Arvind has spent his career building systems that simplify and scale enterprise data management. If you have questions or thoughts on the future of data integration, feel free to reach out to him directly at arvind@tabsdata.com.

- Databases

- Foundation Models

- GenAI Applications

- AI & Data Culture

- Workshops

- AI & Data Culture

- Databases

- Workshops

- Foundation Models

- Databases

- AI Engineering

- Lightning Talks

- Workshops
Modern data lakes promise affordability and scalability, but using them can be a headache. Cloud data warehouses make querying easy, but they come with a hefty price tag and extra complexity. What if you could get the same ease of use without the cost and lock-in?
In this session, we’ll show you how to leverage open-source software to build a fully functional, queryable analytics powerhouse using DuckDB, Fivetran, and Polaris Catalog. We’ll walk through how to:
1. Load data that is automatically
George Fraser is the CEO and co-founder of Fivetran. George founded the company with COO Taylor Brown in 2012 after completing the prestigious Y-Combinator accelerator program. Since then he has built Fivetran, a fully managed automated data integration provider, from an idea to a rapidly growing global business valued at $5.6 billion, supported by a global team of 1000+ employees.

- Data Eng & Infrastructure

- Workshops

- GenAI Applications

- Workshops

- Databases

- AI Engineering

- Data Sci & Algos

- Workshops
Ori Soen is a serial entrepreneur and current Founder/CEO of Montara Inc., creating the first unified DataOps platform for data development. With a track record of building, scaling, and successfully exiting multiple startups, Ori previously served as EVP at Medallia (NYSE: MDLA), where he helped take the company public. At Medallia, he led the Digital Products Business Unit, mid-market sales, served as CMO, and joined through the acquisition of Kampyle, where as CEO he built the company into a market leader. Earlier, he was CMO and Head of Product at Jajah (acquired by Telefonica).

- Databases

- Workshops
Modern data workloads demand fast, interactive, and scalable visualization—without the cost and complexity of server-side rendering. The local-first approach leverages modern browser capabilities, WebAssembly, and in-browser computation to achieve high-performance analytics while reducing cloud costs.
In this workshop, we’ll explore:
1. Why Local-First? The benefits of running everything client-side for cost-efficient, scalable visualization across thousands of users.
2. WebAssembly (WASM) for Data
- Data Sci & Algos

- AI & Data Culture

- AI & Data Culture

- Lightning Talks
Not all AI agent use cases are created equal. While code generation agents can be tested against clear benchmarks, operational agents tackling real-world problems face a fundamentally different challenge: how do you evaluate an agent that must navigate complex, dynamic systems without a predefined playbook? Take root cause analysis in distributed systems: an agent must understand intricate service dependencies, parse through inconsistent logs, and reason about potential failure modes. Unlike cod
- GenAI Applications
Mitul Tiwari is CTO and Co-founder of a stealth AI startup. Until recently he was a Director of AI and Machine Learning Engineering at ServiceNow, leading the natural language technologies group. Earlier he was CTO and Co-founder of Passage AI (acquired by ServiceNow). His expertise lies in building data-driven products using AI, machine learning, and big data technologies. Previously he was head of People You May Know and Growth Relevance at LinkedIn, where he led technical innovations in large-scale social recommender systems. Prior to that, he worked at Kosmix (now Walmart Labs) on web-scale text categorization and its applications. He earned his PhD in Computer Science from the University of Texas at Austin and his undergraduate degree from the Indian Institute of Technology, Bombay. He has also co-authored more than twenty publications in top conferences such as ACL, AAAI, KDD, WWW, RecSys, VLDB, SIGIR, CIKM, and SPAA.

- Data Sci & Algos

- GenAI Applications

- Workshops

- AI & Data Culture

- MLOps & Platforms

- Lightning Talks

- AI Engineering

- Data Eng & Infrastructure

- Data Eng & Infrastructure

- Data Eng & Infrastructure

- Data Eng & Infrastructure

- Data Sci & Algos
Ciro Greco is the Founder and CEO of bauplan, a zero-copy, data-first FaaS platform launched in 2023. He holds a Ph.D. in Experimental Psychology, Linguistics and Neuroscience from the University of Milano-Bicocca. Previously, he co-founded Tooso, an AI-powered commerce search company acquired by Coveo in 2019, where he later served as VP of Artificial Intelligence. His expertise spans AI, linguistics, and data engineering, with a focus on making cloud data pipelines more efficient.

- Data Sci & Algos

- MLOps & Platforms
Marcel Kornacker is currently CTO and Co-Founder at Pixeltable and is notably known as the founder of Apache Impala and co-founder of Apache Parquet. He holds a Ph.D. in Computer Science from UC Berkeley, where he studied databases under Joe Hellerstein. His career includes founding several startups, including Blink Computing, which provided data lake analytics as a service. He has also served as an Entrepreneur in Residence at both Sutter Hill Ventures and Coatue Management. Kornacker's expertise spans databases, data analytics, and open-source technologies.

- Lightning Talks

- MLOps & Platforms

- Data Eng & Infrastructure

- Analytics & BI

- MLOps & Platforms

- MLOps & Platforms

- GenAI Applications

- Lightning Talks

- Lightning Talks

- Lightning Talks

- Lightning Talks
Michael is a globally recognized leader in AI and data-driven business transformation, known for developing AIOS, an intelligent marketing system at Plus that optimizes customer journey analysis and marketing ROI. A former NYU Stern marketing professor, his research spans predictive computation, marketing strategies, and consumer behavior. He has launched successful AI products focused on user safety, privacy, and marketing optimization. His work on explainable prediction and data synthesis, particularly with incomplete data, has been featured in major publications like LA Times and AdAge, and he frequently speaks at prestigious events including The Nantucket Project and ANA Masters of Marketing.

- Workshops

- Workshops

- Workshops

- AI & Data Culture
At the foundation of AI project failures lies a critical gap between data teams and business reality. On top of this gap, data quality issues, unexpected privacy concerns, and tools that don't align with actual business problems arise to hinder or block implementation. As we've built our own AI product—AI Decisioning—and implemented it with customers, we've learned that successful AI implementations depend on embedding data teams within business units. Embedding doesn't mean breaking apart your
- Lightning Talks
Building Cost-Effective LLM Routers: Boost Accuracy 25% While Cutting Costs 90% | This session reveals how to build intelligent model routers that dynamically direct inputs to the optimal large language model (LLM) for each specific task. Attendees will learn practical implementation strategies for multi-model LLM systems that significantly improve performance metrics—achieving up to 25% higher accuracy while reducing operational costs by as much as 90%. The presentation covers essential routing
- Data Eng & Infrastructure

- Data Eng & Infrastructure

- Analytics & BI

- Lightning Talks

- MLOps & Platforms

- Workshops

- Workshops

- Workshops

- Workshops

- Lightning Talks

- Lightning Talks
The first two years of the GenAI revolution are bending the OSS way: Open Source models have reached state of the art, and most of the ecosystem around AI is open-source. The key to AI adoption is properly organizing and using business knowledge. In industry, LLMs give way to Small Specialized Models (SSMs), utilized by Domain Expert Agents (DXAs). Their work should be structured according to the domain requirements, requiring structured output. Organizing and using domain knowledge for AI has l
- Data Eng & Infrastructure
Building High-Throughput Data Orchestration: Instacart's Journey to 20M Daily Workflows | Explore how Instacart built an enterprise-grade orchestration system handling 20 million daily workflows across diverse technical domains. Learn implementation details of their cloud-native platform combining Apache Airflow and Temporal for robust scheduling and execution. Deep dive into YAML-based workflow definitions, GitOps deployment patterns, and observability solutions that enable reliable scaling. Pr
- GenAI Applications

- Lightning Talks

- AI & Data Culture

- Databases

- Lightning Talks

- Workshops
WHY ATTEND?
Go beyond just conference talks and engage directly with our community.
- Expo Hall & Networking
- Interactive Workshops
- Speaker Office Hours
- Drinks & Demos
Expo Hall & Networking

Discover cutting-edge tools and technologies from innovators at the forefront of AI & data. Explore, connect, and get a firsthand look at what’s next.
Interactive Workshops

Why pay extra to level up your career? Gain practical training on the latest data tools from the architects and builders of the tools themselves. (All workshops are included in the ticket price.)
Speaker Office Hours

Our speakers provide real technical depth and go beyond whitepaper-level details. Office Hours sessions with speakers follow each talk and feature additional in-depth discussion opportunities for attendees.
Drinks & Demos

Rub shoulders with the brightest minds in AI & data. Come to make meaningful connections with startups, customers, peers, investors & more.
AI Launchpad
Join us on Day 1 during our 🥳 Community Party to hear from 6 exceptional AI startups.
Brought to you by Zero Prime Ventures.







3-Day Conference Passes
Startup Ticket
$799
Founder-Friendly Pricing
Our special discounted rate for
companies that have raised <$5M
Regular Ticket
$1999
Discounted Group Pricing
Buy 5 Tickets = $1,200/each
Buy 10 Tickets = $1,000/each
Investor Ticket
$4999
(They can afford it 💸)
Investor tickets help subsidize our
Startup Tickets. Thank you :)
💥 INCLUDED IN ALL TICKETS 💥
3 Full Days • Free Workshops • Speaker Office Hours • Community Party • Data Council T-shirt & Tote
• Breakfast, Lunch & Snacks • Coffee & Drinks • Locally Sourced Food • Talk Recordings • Fun Networking
👮‍♀️ NEED MANAGER APPROVAL?
Check out our Convince Your Boss Email Template











See You in Oakland!
This year, we're excited to call the historic Oakland Scottish Rite Center home to Data Council 2025.
Nestled on Lake Merritt with stunning lake views, this architectural gem puts you steps from downtown's best hotels, dining, and nightlife. Just 15 minutes from BART or a scenic ferry ride from downtown San Francisco.
547 Lakeside Dr, Oakland, CA, 94612
🎟️ BIG TEAM = BIG DEAL 💸
Data Council is more fun with friends. Save 40% when you purchase a 5-pack, and 50% with a 10-pack.











Oakland
The Temple of Data
April 22 - 24, 2025
Oakland Scottish Rite Center
Why Attend Data Council?
Learn from Industry Experts
Get architectural insights and best practices straight from the pioneers building the future of data & AI. No marketing fluff here.
Hands-On Experiences
Put theory into practice through interactive workshops and learning opportunities, such as our unique office hours where you can meet any speaker in a small group setting.
Unparalleled Networking
Get exclusive access and connect with engineers and founders who speak your language. No suits and sales pitches, just real pros sharing their work.
Meet the Hosts
Content quality sets Data Council apart. Unlike other conferences that simply accept abstracts as-is, our track hosts go the extra mile to hand-select presentations and collaborate with speakers on their topics to ensure the highest-value talks take the stage.

Bryan Bischof
Head of AI
Theory Ventures

Carlos Aguilar
Founder
Hashboard

Daniel Francisco
Director of Product
Meta

Maggie Hays
Community Product Manager
Acryl Data
Roger Magoulas
Principal
Almost Data

Sai Srirampur
Principal Engineer
Clickhouse

Scott Breitenother
Founder
Brooklyn Data

Sean Anderson
Head of Product Marketing
Vectara

Sean Taylor
Data Scientist
OpenAI

Swyx (Shawn) Wang
Co-Host
Latent.Space Podcast

Tristan Zajonc
CEO & Co-Founder
Continual
About Our Tracks
Our carefully curated tracks balance proven technical foundations with emerging data & AI trends. Get real frameworks, techniques and actionable knowledge straight from seasoned practitioners.
Data Eng & Infrastructure
AI Engineering
Data Science & Algos
GenAI Applications
Analytics & BI
MLOps & Platforms
Foundation Models
Databases
AI & Data Culture
Lightning Talks
FAQ
Yes! We <3 teams at Data Council and offer streamlined packages for groups of 5 or 10 with huge savings of 40-50% off regular ticket prices. Best of all, you can purchase them directly with no invoicing or back-and-forth needed with a sales rep. Simply visit our ticketing site to learn more about group rates.
Yes, we offer discounts for startups (must have raised <$5M), non-profits, government agencies, and academic students & faculty. For startup discounts, please see our ticketing site; for non-profit and academic discounts, please contact community@datacouncil.ai for more information.
We're excited to bring Data Council back to the Bay Area on Apr 22-24, 2025! The event will be held at the historic Oakland Scottish Rite Center, right off the shores of beautiful Lake Merritt in Oakland, CA.
Once you purchase your ticket, all talks, workshops and networking opportunities are available to you as part of the Data Council experience. However, external costs such as travel, lodging and commuting are your responsibility.
Meet the Team

Pete Soderling
Founder & GP at Zero Prime Ventures

Yang Tran
Partner at Zero Prime Ventures

Tim Wu
Head of Marketing at Data Council

Missy Bass
Events Director at Data Council

Gillian Jarvis
Design Advisor at Data Council
Discover the data foundations powering today's AI breakthroughs. Join leading minds as we explore both cutting-edge AI and the infrastructure behind it.