Biologically informed variational autoencoders allow predictive modeling of genetic and drug-induced perturbations

Presentation date: April 04, 2024

Presenter: Jun Sik Kim


What is PGEE_M?

Penalized Generalized Estimating Equation of Multinomial Responses (PGEE_M) is a method that simultaneously identifies important variables and estimates their regression coefficients for high-dimensional longitudinal multinomial responses.
For both variable selection and estimation with high-dimensional longitudinal data, PGEE_M uses two non-convex penalties: the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP).
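As a rough illustration of the two penalties named above, the following Python sketch evaluates the standard textbook forms of SCAD and MCP. It is not taken from the PGEE_M source. The function names and the default tuning constants (a = 3.7 for SCAD, gamma = 3.0 for MCP) are assumptions chosen for illustration and may differ from what PGEE_M uses.

```python
import numpy as np


def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty evaluated at |beta| (a > 2 is a tuning constant)."""
    t = np.abs(np.asarray(beta, dtype=float))
    linear = lam * t                                            # region |beta| <= lam
    quad = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))    # lam < |beta| <= a*lam
    const = lam**2 * (a + 1) / 2                                # |beta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))


def mcp_penalty(beta, lam, gamma=3.0):
    """Standard MCP evaluated at |beta| (gamma > 1 is a tuning constant)."""
    t = np.abs(np.asarray(beta, dtype=float))
    inner = lam * t - t**2 / (2 * gamma)                        # |beta| <= gamma*lam
    const = gamma * lam**2 / 2                                  # |beta| > gamma*lam
    return np.where(t <= gamma * lam, inner, const)
```

Both penalties behave like the lasso penalty near zero and become flat for large coefficients, which is why large effects are not shrunk while small ones are pushed toward zero.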
To estimate the model parameters, PGEE_M adopts an iterative algorithm that combines the minorization-maximization (MM) algorithm, which handles the non-convex penalty, with the Fisher-scoring algorithm.
The detailed algorithm is described in the original article, “Penalized generalized estimating equations approach to longitudinal data with multinomial responses”.
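To give a feel for how a quadratic bound on the penalty can be combined with a scoring step, here is a heavily simplified Python sketch. It assumes a continuous response with identity link and independent observations, not the multinomial estimating equations of PGEE_M, and every name in it (`scad_derivative`, `penalized_fit`, `eps`, the simulated data) is hypothetical. The original article remains the reference for the actual algorithm.

```python
import numpy as np


def scad_derivative(t, lam, a=3.7):
    """Derivative of the standard SCAD penalty at t >= 0 (a = 3.7 is an assumed default)."""
    return lam * np.where(t <= lam, 1.0, np.maximum(a * lam - t, 0.0) / ((a - 1) * lam))


def penalized_fit(X, y, lam, n_iter=100, eps=1e-6, tol=1e-10):
    """Simplified penalized scoring iteration (identity link, independence)."""
    n, _ = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # unpenalized starting value
    info = X.T @ X                                # information matrix for this simple model
    for _ in range(n_iter):
        score = X.T @ (y - X @ beta)              # estimating function at current beta
        # Quadratic bound on the penalty around the current beta:
        # each coefficient gets weight q'(|beta_j|) / (eps + |beta_j|).
        E = np.diag(scad_derivative(np.abs(beta), lam) / (eps + np.abs(beta)))
        # Scoring step on the penalized estimating equation.
        step = np.linalg.solve(info + n * E, score - n * E @ beta)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated toy data (purely illustrative, not the PGEE_M sample dataset).
rng = np.random.default_rng(0)
true_beta = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
X = rng.normal(size=(200, 5))
y = X @ true_beta + rng.normal(size=200)
beta_hat = penalized_fit(X, y, lam=0.3)
```

In this toy setting the coefficients of the three null covariates are driven to values close to zero, while the two large coefficients are left essentially unpenalized.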
The PGEE_M software supports only the independence working correlation structure.
PGEE_M reuses parts of the code from the “PGEE” and “multgee” R packages.

Sample Dataset

The sample dataset contains 500 samples (subjects), each evaluated at 4 time points, with 100 covariates and a response variable with 5 categories.
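The dimensions above can be made concrete with a small Python sketch that builds a placeholder dataset of the same size. It assumes a long format with one row per subject and time point. The actual file layout and column names of the sample dataset are not stated here, so the variable names and the random values below are purely hypothetical.

```python
import numpy as np

# Dimensions taken from the description of the sample dataset.
n_subjects, n_times, n_covariates, n_categories = 500, 4, 100, 5
n_rows = n_subjects * n_times  # one row per subject and time point (assumed layout)

rng = np.random.default_rng(0)
subject_id = np.repeat(np.arange(1, n_subjects + 1), n_times)  # 1,1,1,1,2,2,2,2,...
time_point = np.tile(np.arange(1, n_times + 1), n_subjects)    # 1,2,3,4,1,2,3,4,...
X = rng.normal(size=(n_rows, n_covariates))                    # placeholder covariates
y = rng.integers(1, n_categories + 1, size=n_rows)             # placeholder categories 1..5
```

Under this layout the covariate matrix has 2000 rows and 100 columns, so the number of covariates is large relative to the number of independent subjects.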


The PGEE_M program is developed and maintained by

  • Md. Kamruzzaman, Bioinformatics and Biostatistics Lab., Dept. of Statistics, Seoul National University.


An example R script is linked here.
An example dataset is linked here.



This folder includes the R source files for the numerical study in the manuscript submitted to Bioinformatics, 2014, titled “AucPR: An AUC-based approach using penalized regression for disease prediction with high-dimensional omics data.”
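For readers unfamiliar with the quantity that AUC-based methods target, the following Python sketch computes the empirical AUC of a risk score as the proportion of case/control pairs that are ranked correctly. It is not part of the AucPR code, and the function name `empirical_auc` is hypothetical. The penalized objective used by AucPR itself is described in the manuscript and the R sources.

```python
import numpy as np


def empirical_auc(scores, labels):
    """Empirical AUC: share of (case, control) pairs where the case scores higher.

    Ties between a case and a control count as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    cases = scores[labels == 1]
    controls = scores[labels == 0]
    diff = cases[:, None] - controls[None, :]     # all pairwise differences
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return correct / (cases.size * controls.size)
```

For a linear predictor, `scores` would be the product of the covariate matrix and a coefficient vector, so the AUC becomes a function of the coefficients.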


The following source files are included:

AucPR.R: Contains all the functions needed.

Simu_Setting.R: Generates the settings for the simulations.

Case_Study.R: Runs a simulation study and a real-data example.

mhsauc_tgdr.f90: Fortran code implementing Ma & Huang’s method (MSauc); mhsauc_tgdr.dll is its compiled DLL version, which is called from R.

For details, please see the R code.


You can download a zip file containing the source code and this README from this link: Codes_Penalized_AUC