Vectorization of Computations for Optimizing Code in the Python Programming Language
Journal Title: Challenges and Issues of Modern Science - Year 2024, Vol 3, Issue 1
Abstract
Purpose. This study explores vectorization as an engineering technique for improving the performance and readability of Python code, particularly in data processing tasks. It demonstrates the benefits of vectorization through practical examples involving the handling of missing data.
Design / Method / Approach. We performed a comparative analysis of loop-based and vectorized implementations. Specifically, we developed two versions of a function that identifies the columns containing missing values in a dataset, tested both on two real-world datasets, and compared their execution time and code readability.
Findings. Vectorization yielded substantial performance improvements, cutting execution time by a factor of several hundred compared with the traditional loop-based method. The vectorized code was also more compact, which makes it easier to read and maintain.
Theoretical Implications. Vectorization provides a higher level of abstraction for performing operations on data structures. Developers can focus on algorithmic logic rather than on managing iterative control structures, which contributes to broader discussions on optimizing computational efficiency in Python.
Practical Implications. For data engineers and analysts, vectorization is a highly effective way to optimize Python code. It significantly accelerates data-intensive tasks such as missing data imputation, data analysis, and machine learning, making it an essential tool for productivity in data-driven environments.
Originality / Value. This study presents a practical approach to optimizing Python code through vectorization. It is valuable for professionals seeking to improve the efficiency of their workflows.
Research Limitations / Future Research. The research is limited by its focus on a single problem: missing data imputation. Future research should expand the scope to other computational areas, such as image processing and simulation modeling, or examine the use of vectorization alongside Just-In-Time (JIT) compilation with tools like Numba to further boost Python's performance.
Paper Type. Practitioner Paper.
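The comparison described in the abstract can be illustrated with a minimal sketch. The authors' actual functions and datasets are not reproduced on this page, so everything below is an assumption made for illustration: the use of NumPy, the function names `columns_with_missing_loop` and `columns_with_missing_vectorized`, and the small sample array. Missing values are assumed to be encoded as NaN in a numeric array.

```python
import math

import numpy as np


def columns_with_missing_loop(data, names):
    """Loop-based version (hypothetical): inspect cells one at a time."""
    found = []
    n_rows, n_cols = data.shape
    for j in range(n_cols):
        for i in range(n_rows):
            if math.isnan(data[i, j]):
                found.append(names[j])
                break  # one missing value is enough to flag this column
    return found


def columns_with_missing_vectorized(data, names):
    """Vectorized version (hypothetical): whole-array operations only."""
    # np.isnan builds a boolean array; any(axis=0) reduces it per column.
    mask = np.isnan(data).any(axis=0)
    return np.asarray(names)[mask].tolist()


# Illustrative sample data, not taken from the study.
names = ["age", "income", "score"]
data = np.array([
    [25.0, 50000.0, 0.7],
    [31.0, np.nan, 0.9],
    [np.nan, 42000.0, 0.4],
])

print(columns_with_missing_loop(data, names))        # ['age', 'income']
print(columns_with_missing_vectorized(data, names))  # ['age', 'income']
```

Both functions return the same result. The vectorized version expresses the whole check in two lines and delegates the iteration to NumPy's compiled routines, which is where a speed-up of the kind reported in the abstract would come from. The actual ratio depends on the data size and the environment.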
Authors and Affiliations
Олексій Земляний, Олег Байбуз
Cybersecurity of Critical Infrastructure: Challenges of Innovation and Threats of Digital Technologies
Purpose. The article aims to explore the characteristics, development prospects, and potential threats associated with the cybersecurity of critical infrastructure. As critical systems become increasingly dependent on di...
Microplastics in agricultural soils: sources and microbial remediation approaches
Purpose. The purpose of this study was a theoretical analysis of the sources of microplastics in agricultural soils, their impact on agroecosystems, and microbial remediation approaches to remove microplastics from the soil....
Development of an Automated Temperature Control System for Baking Bakery Products Using a Fuzzy Controller
Purpose. The purpose of the work is to improve the main indicator of the economic efficiency of the production of bakery products - the saving of electricity for heating inside the chamber. Design / Method / Approach. Ac...
Methodology for Choosing the Model Discretization Step in Information and Measurement Technologies
Purpose. The main aim of the research is to produce recommendations in the form of a methodology that permits designers to establish a discretization step for a simulation model, when that model is being prepared to be i...
The Multifaceted Nature of Tokens in Entrepreneurial Ecosystems: New Horizons in the Axiology of Digital Assets
Purpose. This study aims to explore the multifaceted nature of tokens in entrepreneurial ecosystems and their impact on value creation and distribution processes, as well as to develop a conceptual framework integrating...