Big data solutions must not only handle large volumes of data, but also (1) enable analysis within predictable time frames and (2) make it possible to optimize prediction quality across algorithm families and various pre- and post-processing steps. In the talk, I will first discuss algorithms that offer fast response times in the framework of stream mining and online learning. As an example, I will discuss the system EDO (estimation of densities online), which supports arbitrary queries on joint probabilities of mixed discrete and continuous variables of a data stream. EDO estimates densities in an online fashion, supports mining patterns in the data stream, and explicitly models and recognizes recurring distributions. The second part of the talk will focus on prediction quality. I will discuss the optimization of prediction quality in the Scavenger system, a flexible tool for testing algorithm families and reusing intermediate results of computations in incremental machine learning and data mining schemes. I will show use cases of Scavenger in the areas of multi-label classification and Boolean matrix factorization.
Stefan Kramer is a full professor and head of department at the Institute of Computer Science of Johannes Gutenberg University (JGU) Mainz. Before his appointment at JGU, he was an associate professor in the computer science department of Technische Universität München (2003 to 2011). He has been active in the field of data mining since the first conference worldwide in 1995 and is the author of award-winning papers at ICDM, KDD and ILP. He was vice-chair of ICDM 2013 and regularly serves as area chair for conferences such as ECML/PKDD. His research interests include mining structured data, stream mining, process mining and clustering.
Last modified: Wednesday, 15-Mar-2017 15:54:04 NZDT