I would like to vectorize some operations on arrays that are not available in NumPy ndarrays or pandas DataFrames/Series, such as comparing element-wise two arrays/Series/DataFrames of the same shape, where one contains scalar values and the other contains lists.
Can I write my own data structure in Python that uses vectorization? How can I call the vectorizing routines of the general libraries, or use more fundamental functions to do this?
Is there no way to vectorize in Python without relying on Fortran/C, and hence on SciPy/NumPy/pandas and similar libraries? I don't understand why Python by itself would not be able to manage arrays just like C does. Is it impossible by design, or is it simply not done because offloading to compiled code works better?
For example, vectorization is performed when adding two arrays in NumPy, such as ndarray_1([1,2,3]) and ndarray_2([3,2,6]), which gives ndarray_3([4,4,9]). As far as I understand, this happens in one step with no hidden loop: all the operations are carried out at once in memory.
I would like to know how to code such a vectorized operation in plain Python (without using NumPy, for my own enlightenment).
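To make the target concrete, here is a minimal sketch of the closest thing I can write in plain Python. The class name Vec is hypothetical, and the generator expression inside `__add__` is still an ordinary interpreted loop, only hidden behind the + operator, which is exactly what I would like to get away from.

```python
class Vec:
    """Hypothetical list wrapper that supports element-wise addition."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        if len(self.values) != len(other.values):
            raise ValueError("operands must have the same length")
        # Still a Python-level loop: zip pairs the elements, the generator adds them one by one.
        return Vec(a + b for a, b in zip(self.values, other.values))

    def __repr__(self):
        return f"Vec({self.values})"


result = Vec([1, 2, 3]) + Vec([3, 2, 6])
print(result)  # Vec([4, 4, 9])
```

The syntax looks like the NumPy one, but the work is done element by element by the interpreter.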
Similarly, this would be especially useful to me because I know there are functions built on the semantics of the keywords is and in, such as isin in pandas, which checks a whole Series to see whether each element is contained in the provided iterable.
Unfortunately, if I have an array of lists, I have to loop over this array and pass each list in turn as the iterable to "isin".
This is not good at all for my application.
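For concreteness, the element-wise membership test I am after can be sketched in plain Python as follows. The variable names and sample data are made up for illustration, and the comprehension is again an ordinary loop, not a vectorized operation.

```python
# Hypothetical sample data: one value and one list of candidates per position.
values = [1, 5, 3]
candidates = [[1, 2], [3, 4], [3, 9]]

# Pair each value with the list at the same position and test membership.
mask = [value in options for value, options in zip(values, candidates)]
print(mask)  # [True, False, True]
```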
Other use cases would be getting rid of functions such as map and apply, which are disguised for loops with some optimizations, and moving towards true vectorization.
For example: applying, in one pass over a Series or DataFrame, an element-wise type test (isinstance), or a function that depends on both a mathematical formula and a condition (though here I could compute the vectorized f(x) first and then apply the boolean condition in a second step), and so on.
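As a rough illustration of these two cases, written with plain loops: the formula f(x) = x ** 2 and the condition x > 0 are arbitrary choices for the example, not part of my real application.

```python
# Hypothetical mixed-type data for the element-wise type test.
data = [-2, 3, "a", 4.5]
is_number = [isinstance(x, (int, float)) for x in data]
print(is_number)  # [True, True, False, True]

# Two-step version of "formula plus condition" on numeric data.
numbers = [-2, 3, 4.5]
squared = [x ** 2 for x in numbers]  # step 1: apply f(x) everywhere
# Step 2: keep f(x) where the condition holds, otherwise fall back to 0.
result = [s if x > 0 else 0 for x, s in zip(numbers, squared)]
print(result)  # [0, 9, 20.25]
```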
There are tons of use cases, and I do know that I can handle some of them with NumPy/pandas.
But first, I can't do everything I want to do (such as finding, element-wise, whether an element of a frame is in the list at the same position of a comparable frame, as stated above, which would be tremendously useful), and this forces me to resort to steampunk code (which works by some weird transformation of known science, but might not if you look at it closely and skeptically enough, and which is anyway ugly and convoluted) to get away with it.
Plus, these solutions are not always efficient, and when they are, only moderately so.
Second, I want to become a better programmer. That means not just relying on previous work without understanding it, or on hacks and workarounds such as pushing everything into Cython, Numba, C and Fortran under the hood.
If true vectorized approaches are feasible in Python, even if less performant because of the specifics of the language (we all know that basically anything out there is slower than C, C++ and Fortran, apart maybe from a few newer languages that are compiled rather than interpreted), I would like to learn how, as part of improving my skills and understanding of programming.
Hence my question. Thanks for helping me do it better.