Data Types


A major component of our work involves data wrangling: the processes spanning data acquisition through the production and delivery of data sets in analyzable formats. The art of data wrangling requires considerable data-type-specific know-how, including an understanding of what the data can accurately measure about a system and what processing is needed to enable analyses that return actionable insights and high-value predictions. In the MezeyLab, we most frequently work with genomic and other high-throughput molecular data types, although we are increasingly working with medical imaging data, clinical trial data, real world evidence (RWE) data, and electronic health record (EHR) data.

In the MezeyLab, a particular area of expertise is processing and integrating the same data sources for more than one application, and we are often able to suggest to our collaborators additional uses of their data that go beyond their original interest. All of our data wrangling work is closely integrated with our research and development of computational analysis methodology for a broad spectrum of applications. Please see our publications and the following pages for representative examples of our work with various data types, including:

  • Genomic and Other High-Dimensional Molecular Data, including microarray genotype data, exome and whole-genome next-generation sequencing, epigenomic data including Methyl-Seq and ATAC-Seq, genome-wide expression data including whole-transcriptome and non-coding RNA profiles at both tissue and single-cell resolution, high-throughput B cell receptor (BCR) sequencing, and proteome and metabolome data measured by antibody arrays and mass spectrometry


  • Medical Image Data, including digitized pathology slides, confocal images, computed tomography (CT) imaging of cancer, and magnetic resonance imaging (MRI) profiling of disease


  • Clinical Data from clinical trials and studies, as well as clinically relevant information extracted from Real World Evidence (RWE) data sources


  • Electronic Health Record (EHR) Data, including patient records extracted from the major EHR platforms and other EHR data sources