Online MLMU #4: Large-scale forecasting & Free text clustering

  Machine learning

Our first meetup in June welcomes two speakers, both presenting interesting real-world applications of ML. Michał will demonstrate his team's work on time series at large scale, and Tetyana will share DHL's approach to processing large volumes of unstructured text.

The meetup will be hosted online on Zoom (and streamed on YouTube).

Talk: Developing and deploying a large-scale inventory optimization system (Michał Kurcewicz)

Abstract: We present a large-scale forecasting and inventory optimization system that uses AI techniques to generate a demand forecast and optimize replenishment orders. The use of machine learning techniques allows retailers to fully utilize the data available in a modern organization. These data include not only detailed sales and promotion information but also data from the company’s loyalty program, product database, and e-commerce system, information on local events, and data supplied by expert business users and external sources. We discuss a comprehensive forecasting and order optimization pipeline. It includes data cleansing, feature creation, and time-series segmentation based on characteristics such as seasonality or intermittency. Next, we discuss the recommended forecasting approach for each segment. Finally, we present the order optimization process. The system has been implemented at a major Polish retailer.
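To make the segmentation step more concrete, here is a minimal sketch of one common way to split demand series by intermittency. The abstract does not say which criteria the presented system uses; this sketch assumes the widely cited ADI / CV² scheme (average demand interval and squared coefficient of variation of non-zero demand) with the usual cutoffs of 1.32 and 0.49. The function name and sample series are hypothetical.

```python
from statistics import mean, pvariance

# Commonly used cutoffs for the ADI / CV^2 scheme (an assumption here,
# not something stated in the talk abstract).
ADI_CUTOFF = 1.32
CV2_CUTOFF = 0.49


def segment_demand(series):
    """Label a demand history as smooth, erratic, intermittent or lumpy."""
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "no demand"
    # ADI approximated as periods per non-zero demand occurrence.
    adi = len(series) / len(nonzero)
    # Squared coefficient of variation of the non-zero demand sizes.
    cv2 = pvariance(nonzero) / mean(nonzero) ** 2
    if adi < ADI_CUTOFF:
        return "smooth" if cv2 < CV2_CUTOFF else "erratic"
    return "intermittent" if cv2 < CV2_CUTOFF else "lumpy"


if __name__ == "__main__":
    for history in ([10, 12, 11, 9, 10, 11], [0, 0, 5, 0, 0, 5, 0, 0, 5]):
        print(history, "->", segment_demand(history))
```

Each label could then be mapped to a different forecasting method, which is the kind of per-segment recommendation the abstract refers to.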

Bio: Michał did his PhD in Economics at the University of Warsaw. Currently, he works at SAS as Principal Analytical Consultant.

Talk: Free text clustering in practice (Tetyana Holets)

Abstract: Every working day at DHL brings millions of records that include free, unstructured text: work emails, chat conversations, descriptions of goods, various surveys, descriptions of incidents in the Data Center, etc. It is close to impossible for a human to process all those texts efficiently. Therefore, we are developing a tool called Topicon, which aims to help and simplify the everyday work of the teams who work with free text. The main goal of Topicon is topic modelling, so that we can have an overview of the main issues discussed in the company. Its functionality includes text preprocessing, vectorization, sentiment analysis, and application of various clustering techniques. Its output is an interactive visualization, which aims to preserve the relations between the clusters and to provide the user with both a high-level and a more detailed overview of the main topics. We will present Topicon’s functionality using data from the DHL Employee Opinion Survey.
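As a rough illustration of the preprocessing, vectorization and clustering steps named in the abstract, the sketch below builds TF-IDF vectors and groups texts with a simple single-pass "leader" clustering on cosine similarity. It is not Topicon's implementation: the stopword list, the 0.2 similarity threshold, the function names and the sample texts are all assumptions made for the example, and sentiment analysis and visualization are left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not Topicon's).
STOPWORDS = {"the", "a", "is", "my", "to", "and", "of", "in", "for", "at"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return L2-normalised sparse TF-IDF vectors (dicts) for the documents."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def leader_clustering(vectors, threshold=0.2):
    """Assign each vector to the most similar cluster leader, or start a new cluster."""
    leaders, labels = [], []
    for vec in vectors:
        best, best_sim = None, threshold
        for cid, lead in enumerate(leaders):
            sim = cosine(vec, vectors[lead])
            if sim >= best_sim:
                best, best_sim = cid, sim
        if best is None:
            leaders.append(len(labels))  # current document becomes a leader
            labels.append(len(leaders) - 1)
        else:
            labels.append(best)
    return labels


if __name__ == "__main__":
    texts = [
        "The laptop battery drains quickly",
        "Server outage in the data center",
        "Laptop battery will not charge",
        "Data center server is down",
    ]
    print(leader_clustering(tfidf_vectors(texts)))
```

Leader clustering is used here only because it fits in a few lines; the abstract mentions "various clustering techniques" without naming them.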

Bio: Tetyana did her Master's in Economics and Econometrics at CERGE-EI. Currently, she works at DHL IT Services as a Machine Learning Specialist. Her interests are NLP, time series prediction, automated data labelling and augmentation, and mathematics.