Track 4: Data Science and Machine Learning

On Reasoning About Black-Box UDFs by Classifying their Performance Characteristics

Michal Bodziony, IBM Software Lab Kraków, Poland
Bartosz Ciesielski, Poznań University of Technology, Poland
Anna Lehnhardt, Poznań University of Technology, Poland
Robert Wrembel, Poznań University of Technology, Poland

Abstract

User-defined functions (UDFs) are frequent components of SQL queries and data processing workflows (DPWs). In both applications, UDFs are often available only as black boxes, i.e., their semantics and performance characteristics are unknown (such functions are hereafter called BBUDFs). This prevents the optimization of query execution plans and of entire DPWs. Discovering the semantics of a BBUDF is often impossible due to the high complexity of its code. In contrast, discovering its performance model seems feasible with the support of machine learning. In this paper, we present a solution for classifying BBUDFs into performance classes. If the performance class of a given BBUDF is known, it may be possible to reason about some hidden features of the BBUDF. Our solution is supported by an experimental evaluation, which reveals that our initial approach, in multiple cases, classifies BBUDFs into adequate performance classes.

Recommended Citation

Bodziony, M., Ciesielski, B., Lehnhardt, A. & Wrembel, R. (2024). On Reasoning About Black-Box UDFs by Classifying their Performance Characteristics. In B. Marcinkowski, A. Przybylek, A. Jarzębowicz, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Harnessing Opportunities: Reshaping ISD in the post-COVID-19 and Generative AI Era (ISD2024 Proceedings). Gdańsk, Poland: University of Gdańsk. ISBN: 978-83-972632-0-8. https://doi.org/10.62036/ISD.2024.83

Paper Type

Full Paper

DOI

10.62036/ISD.2024.83

