Learning to Learn: Training Language Models to Understand Tasks from Few Examples

Abstract:
As AI models become an increasingly common element of many applications, we ever more often run into the practical limitations of specialized models that work well only for a single training task and dataset.
However, huge neural models like OpenAI's GPT-3 have shown that models can be much more versatile and adapt to new tasks on request, given only natural-language instructions and a small number of input-output examples.
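
To make the idea of learning from a few examples concrete, here is a minimal sketch of how an instruction and a handful of input-output demonstrations can be assembled into a single prompt for a language model. The function name, the "Input:"/"Output:" template and the example sentences are illustrative assumptions, not the format used by the speaker's models.

```python
def build_few_shot_prompt(instruction, demonstrations, query):
    """Join an instruction, input-output demonstrations and a new input into one prompt.

    The "Input:"/"Output:" template is a hypothetical choice made for this sketch.
    """
    lines = [instruction, ""]
    for text, answer in demonstrations:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {answer}")
        lines.append("")  # blank line between demonstrations
    # The prompt ends with the new input and an open "Output:" slot;
    # the model is expected to continue the text with its answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Made-up demonstrations for a toy entity detection task.
instruction = "Name the location mentioned in the text."
demonstrations = [
    ("The meetup takes place in Brno.", "Brno"),
    ("She moved to Bratislava last year.", "Bratislava"),
]
prompt = build_few_shot_prompt(
    instruction, demonstrations, "The conference was held in Košice."
)
print(prompt)
```

The model's weights stay unchanged here: the task is specified entirely by the text of the prompt, so switching to a new task only means swapping the instruction and the demonstrations.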

This talk will show you that the ability of models to learn new tasks without parameter updates is no longer a privilege of the English language or of big tech companies maintaining whole data centres to perform a single prediction.
We will present a real-world application of our general-purpose models to detecting entities in a previously unseen domain of Czech texts.
You will see that these general models, which can run even on your laptop, perform better than a specialized RoBERTa model trained on 3000 manual annotations.
We make our models and training scripts publicly available for any use and discuss the business implications that this shift in the essential paradigms of machine learning brings for the future.

About the speaker:
Michal Štefánik is a senior language specialist in the NLP team at Gauss Algorithmic and a PhD researcher at Masaryk University.
Over the last six years in NLP, he has led the deployment of deep language models in numerous NLP applications built upon Named Entity Recognition or Neural Machine Translation models.
Michal conducts research on enhancing the robustness of large language models, including generalization to unseen tasks.
He is also a founder of the students' Transformers Club, whose members received international prizes, including first place in Meta's NAACL DADC competition.
https://michal-stefan...

Program:

* 17:00 Welcome chat
* 18:00 Talk
* 18:45 Discussion
* 19:00 Networking (Otevřená zahrada)

Machine Learning Meetups (MLMU) is an independent platform for people interested in Machine Learning, Information Retrieval, Natural Language Processing, Computer Vision, Pattern Recognition, Data Journalism, Artificial Intelligence, Agent Systems and all related topics. MLMU is a regular community meeting that usually consists of a talk, a discussion and subsequent networking. Besides Prague, MLMU has also spread to Brno, Bratislava and Košice.
https://www.mlmu.cz/...

Free