H2O.ai Prague Meetup #5: Scalable Automatic Machine Learning in H2O


Wednesday, 27 November 2019

18:00 - 21:00

This event has already taken place.


About the event

Dear Makers,

We are hosting our fifth meetup on 27th November.

Agenda:
- Doors open at 6 pm. Refreshments + Networking until 6:30 pm.
- Welcoming Remarks by H2O.ai team.
- Tech Talks
- Lucky Draw for ML Prague Conference Ticket
- Networking until 9 pm.

===

Talk 1: Scalable Automatic Machine Learning in H2O by Erin LeDell

This presentation focuses on scalable, automatic machine learning with the H2O machine learning platform. H2O is an open source, distributed machine learning platform designed for big data. Its core machine learning algorithms are implemented in high-performance Java, but fully featured APIs are available in R, Python, Scala, and REST/JSON, as well as through a web interface. Because H2O's algorithm implementations are distributed, the software can scale to very large datasets that may not fit into RAM on a single machine.

We will provide an overview of the methodology behind H2O's AutoML algorithm. H2O AutoML provides an easy-to-use interface that automates data pre-processing and the training and tuning of a large selection of candidate models, including multiple stacked ensemble models for superior model performance. Thanks to the distributed nature of the H2O platform, H2O AutoML can scale to very large datasets. The result of an AutoML run is a "leaderboard" of H2O models, which can be easily exported for use in production.
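For readers who want a feel for the workflow described above, here is a minimal sketch using the H2O Python API. It is not the speaker's code: the helper names `predictor_columns` and `run_automl`, the file path, the target column, and the assumption that the task is classification are all hypothetical and chosen for illustration only.

```python
from typing import List


def predictor_columns(columns: List[str], target: str) -> List[str]:
    """Return every column name except the target, preserving order."""
    return [c for c in columns if c != target]


def run_automl(train_path: str, target: str, max_models: int = 10):
    """Train an H2O AutoML run on a CSV file and return the AutoML object.

    Hypothetical helper: the path, target, and settings are assumptions.
    """
    # Imported lazily so this file can be loaded without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()  # start or connect to a local H2O cluster
    train = h2o.import_file(train_path)

    # Assumption: a classification task, so the target must be categorical.
    train[target] = train[target].asfactor()

    aml = H2OAutoML(max_models=max_models, seed=1)
    aml.train(
        x=predictor_columns(train.columns, target),
        y=target,
        training_frame=train,
    )
    return aml
```

After a call such as `aml = run_automl("train.csv", "label")` (both arguments are placeholders), `aml.leaderboard` holds the ranked models and `aml.leader` is the top model, which can be exported, for example with `aml.leader.download_mojo(path=".")`.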

R and Python code examples for H2O are available on GitHub so that participants can follow along on their laptops.

About Erin:
Erin LeDell is the Chief Machine Learning Scientist at H2O.ai, the company that produces the open source, distributed machine learning platform, H2O. At H2O.ai, she leads the H2O AutoML project and her current research focus is automated machine learning. Before joining H2O.ai, she was the Principal Data Scientist at Wise.io (acquired by GE) and Marvin Mobile Security (acquired by Veracode), the founder of DataScientific, Inc. and a software engineer. She is also the founder of the Women in Machine Learning and Data Science (WiMLDS) organization (wimlds.org) and co-founder of R-Ladies Global (rladies.org). Erin received her Ph.D. in Biostatistics with a Designated Emphasis in Computational Science and Engineering from the University of California, Berkeley and has a B.S. and M.A. in Mathematics.

Talk 2: Off-Policy Partial Feedback System Reward Estimation in Seznam.cz Web Search Engine by Pavel Prochazka

The talk covers off-policy evaluation of a contextual multi-armed bandit applied to the vertical search blending problem in the Seznam.cz web search engine. We highlight the advantages of the counterfactual off-policy evaluation approach over conventional online A/B testing and introduce basic counterfactual methods such as the inverse propensity score (IPS) reward estimator. For the off-policy evaluation to be valid, the counterfactual approach requires correctly evaluated propensities. The quality of the IPS estimate (its variance) depends on the particular propensity values, which are directly related to how much the logging policy explores.
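As a rough illustration of the IPS idea mentioned above, the sketch below reweights each logged reward by the ratio of the target policy's probability of the logged action to the logging policy's propensity, then averages. This is the textbook estimator, not the Seznam.cz implementation; the record fields, the vertical names, and the toy log are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LoggedInteraction:
    context: str       # hypothetical: a coarse query category
    action: str        # hypothetical: which vertical was shown
    reward: float      # observed feedback, e.g. 1.0 for a click
    propensity: float  # probability the logging policy gave to `action`


def ips_estimate(
    logs: Sequence[LoggedInteraction],
    target_prob: Callable[[str, str], float],
) -> float:
    """Estimate the target policy's mean reward from logged data via IPS."""
    if not logs:
        raise ValueError("need at least one logged interaction")
    total = 0.0
    for rec in logs:
        if rec.propensity <= 0.0:
            raise ValueError("logging propensities must be positive")
        # Importance weight: how much more (or less) likely the target
        # policy is to take the logged action than the logging policy was.
        weight = target_prob(rec.context, rec.action) / rec.propensity
        total += weight * rec.reward
    return total / len(logs)


def target_policy(context: str, action: str) -> float:
    """Hypothetical deterministic policy: one preferred vertical per context."""
    preferred = {"news": "news_box", "photo": "images_box"}
    return 1.0 if preferred.get(context) == action else 0.0


# Toy log, invented for illustration.
logs = [
    LoggedInteraction("news", "news_box", 1.0, 0.5),
    LoggedInteraction("news", "images_box", 0.0, 0.5),
    LoggedInteraction("photo", "images_box", 1.0, 0.25),
    LoggedInteraction("photo", "news_box", 0.0, 0.75),
]

estimate = ips_estimate(logs, target_policy)
print(estimate)  # 1.5
```

On this tiny log the estimate exceeds the maximum possible reward of 1.0. The record with propensity 0.25 receives a weight of 4, which shows how small propensities (little exploration of that action) inflate the variance of the estimate.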

About Pavel:
Pavel Prochazka is a research engineer at Seznam.cz. He focuses on applying information retrieval research to the Seznam.cz web search engine. His interests include learning to rank, counterfactual analysis, and query understanding. Pavel received his Ph.D. in wireless communications from the Czech Technical University in Prague, where his main research focus was iterative detection and Bayesian inference in wireless communication systems.
https://www.linkedin.com/in/pavel-prochazka-4725b852/

Venue

Bubenské nábř. 306/13