I repeated the steps described in this question
https://gis.stackexchange.com/questions/317916/gdalbuiltvrt-output-is-different-than-the-source-images-when-using-dstalpha and it does indeed seem that transparency defined with an alpha band in the original images is lost when I read the data through a VRT. I
wonder if something similar is happening here as in this issue
https://github.com/mapbox/rasterio/issues/1454 with GDALRasterBand::GetMaskBand().
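
One way to check this might be to compare the mask flags that each band reports for the source image and for the VRT. This is only a rough sketch, assuming the GDAL Python bindings are installed. The helper names and the file names in the comments are hypothetical, and the flag values are the GMF_* bits documented for GDALRasterBand::GetMaskFlags().

```python
# Mask flag bits as documented for GDALRasterBand::GetMaskFlags().
GMF_FLAGS = {
    0x01: "GMF_ALL_VALID",
    0x02: "GMF_PER_DATASET",
    0x04: "GMF_ALPHA",
    0x08: "GMF_NODATA",
}


def describe_mask_flags(flags):
    """Return the names of the mask flag bits set in `flags`."""
    return [name for bit, name in GMF_FLAGS.items() if flags & bit]


def report_masks(path):
    """Print color interpretation and mask flags for every band of `path`."""
    # Imported lazily so the decoding helper works without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise RuntimeError(f"could not open {path}")
    for i in range(1, ds.RasterCount + 1):
        band = ds.GetRasterBand(i)
        interp = gdal.GetColorInterpretationName(band.GetColorInterpretation())
        print(path, i, interp, describe_mask_flags(band.GetMaskFlags()))


# Hypothetical file names; substitute the source image and the VRT built from it.
# report_masks("source.tif")
# report_masks("mosaic.vrt")
```

If the source bands report GMF_ALPHA together with GMF_PER_DATASET while the VRT bands report only GMF_ALL_VALID, that would suggest the alpha band is not being used as the mask when reading through the VRT.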