To generate datasets with tunable batch effects, we simulated scRNAseq counts following an approach similar to the one described in muscat. In addition, we estimated batch characteristics from real datasets.
The celltype-specific, gene-wise logFCs of the batch effect were used to simulate counts with the corresponding mean expression for the cells of each batch. To mimic a realistic relationship between celltypes and batch effects, estimates from the celltypes of the real datasets were used to represent the variance between celltypes.
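The idea of shifting a gene's mean expression by a batch logFC before sampling counts can be illustrated with a minimal sketch. This is not the simulation code used here or the muscat implementation; it assumes logFCs on the log2 scale and negative binomial counts drawn as a gamma-Poisson mixture, and all function names and parameter values (`batch_mean`, `simulate_gene`, the base mean of 5, the dispersion of 0.1) are hypothetical.

```python
import random


def batch_mean(base_mean, log2_fc):
    """Shift a gene's mean expression by a batch logFC (assumed log2 scale)."""
    return base_mean * 2.0 ** log2_fc


def sample_poisson(lam, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals before lam."""
    k = 0
    t = rng.expovariate(1.0)
    while t < lam:
        k += 1
        t += rng.expovariate(1.0)
    return k


def sample_nb(mean, dispersion, rng):
    """Draw a negative binomial count as a gamma-Poisson mixture.

    The variance is mean + dispersion * mean**2.
    """
    if mean <= 0:
        return 0
    if dispersion <= 0:
        return sample_poisson(mean, rng)
    # Gamma with shape 1/dispersion and scale mean*dispersion has expectation `mean`.
    lam = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(lam, rng)


def simulate_gene(base_mean, dispersion, log2_fc_by_batch, n_cells, seed=0):
    """Simulate counts of one gene in one celltype for the cells of each batch."""
    rng = random.Random(seed)
    return {
        batch: [
            sample_nb(batch_mean(base_mean, lfc), dispersion, rng)
            for _ in range(n_cells)
        ]
        for batch, lfc in log2_fc_by_batch.items()
    }


# Hypothetical values: one gene, one celltype, two batches.
counts = simulate_gene(
    base_mean=5.0,
    dispersion=0.1,
    log2_fc_by_batch={"batch1": 0.0, "batch2": 2.0},
    n_cells=2000,
)
for batch, values in counts.items():
    print(batch, sum(values) / len(values))
```

In a full simulation, the same step would be repeated per gene and per celltype, with the logFCs and dispersions taken from the estimates on the real datasets rather than fixed by hand.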
Simulated datasets are shown here: