Conserved regulation of RNA processing in somatic cell reprogramming

Data set 1. Transcript expression across human RNA-Seq samples: estimated read counts. The file contains estimated read counts, generated by kallisto (https://pachterlab.github.io/kallisto/), for human transcripts and RNA-Seq samples used in this study (see Additional file 2 of the accompanying publication). The format is a compressed (GZIP) tab-separated transcript-by-sample matrix. Ensembl transcript identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.
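As a minimal sketch of how such a matrix might be loaded, the following Python snippet parses a GZIP-compressed, tab-separated matrix using only the standard library. The helper names (`read_matrix`, `read_gzip_matrix`), the example identifiers and the example values are hypothetical. The sketch assumes that row names occupy the first column; whether the header line carries a label above that column has not been checked against the deposited files, so both layouts are handled.

```python
import csv
import gzip
import io


def read_matrix(handle):
    """Parse a tab-separated matrix with row names in the first column.

    Returns (column_names, {row_name: [values]}). Assumes all cells are numeric.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]
    # Some exports (e.g. R's write.table) omit the header cell above the row names.
    if rows and len(header) == len(rows[0]) - 1:
        columns = header
    else:
        columns = header[1:]
    matrix = {row[0]: [float(value) for value in row[1:]] for row in rows}
    return columns, matrix


def read_gzip_matrix(path):
    """Open a GZIP-compressed matrix file (hypothetical path) and parse it."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_matrix(handle)


# Tiny in-memory stand-in for a real file; identifiers and counts are made up.
demo = (
    "transcript\tSRP000001_sampleA\tSRP000001_sampleB\n"
    "ENST00000000001\t10.0\t0.0\n"
    "ENST00000000002\t5.5\t12.0\n"
)
payload = gzip.compress(demo.encode("utf-8"))
with gzip.open(io.BytesIO(payload), "rt", newline="") as handle:
    samples, counts = read_matrix(handle)
```

For the full matrices a data-frame library may be more convenient; the dictionary representation is used here only to keep the sketch dependency-free.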

Data set 2. Transcript expression across murine RNA-Seq samples: estimated read counts. As in Data set 1, but for mouse transcripts.

Data set 3. Transcript expression across simian RNA-Seq samples: estimated read counts. As in Data set 1, but for chimpanzee transcripts.

Data set 4. Transcript expression across human RNA-Seq samples: estimated transcript abundances. As in Data set 1, but instead of read counts, transcript abundances in transcripts per million (TPM), as estimated by kallisto (https://pachterlab.github.io/kallisto/), are listed. Format, column and row names as in Data set 1.
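To illustrate how TPM values relate to estimated read counts, the sketch below applies the standard TPM definition: each count is divided by the transcript's effective length, and the resulting rates are rescaled to sum to one million within a sample. The function name and the numbers are hypothetical. Effective lengths are not part of the matrices described here, so this is an illustration of the unit rather than a way to regenerate the deposited values.

```python
def counts_to_tpm(counts, effective_lengths):
    """Convert per-transcript read counts of one sample to TPM.

    counts and effective_lengths are parallel lists; lengths must be positive.
    """
    rates = [count / length for count, length in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        # No reads assigned in this sample: all abundances are zero.
        return [0.0 for _ in rates]
    return [rate / total * 1e6 for rate in rates]


# Made-up counts and effective lengths for three transcripts in one sample.
tpm = counts_to_tpm([100.0, 300.0, 600.0], [1000.0, 1500.0, 2000.0])
```

One consequence of this definition is that each sample column of a TPM matrix should sum to approximately one million, which can serve as a quick sanity check after loading.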

Data set 5. Transcript expression across murine RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for mouse transcripts.

Data set 6. Transcript expression across simian RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for chimpanzee transcripts.

Data set 7. Differential expression analyses across human RNA-Seq sample groups: log fold changes. The file contains log fold changes, inferred by edgeR (http://bioconductor.org/packages/release/bioc/html/edgeR.html), for human genes and the RNA-Seq sample group contrasts listed in Additional file 3 of the accompanying publication. The format is a compressed (GZIP) tab-separated gene-by-comparison matrix. Ensembl gene identifiers and a descriptive contrast identifier serve as row and column names, respectively.

Data set 8. Differential expression analyses across murine RNA-Seq sample groups: log fold changes. As in Data set 7, but for mouse genes.

Data set 9. Differential expression analyses across simian RNA-Seq sample groups: log fold changes. As in Data set 7, but for chimpanzee genes.

Data set 10. Differential expression analyses across human RNA-Seq sample groups: false discovery rates. The file contains false discovery rates (FDR) for the differential expression analyses summarized in Data set 7. Format, column and row names as in Data set 7.
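Because the log fold change and FDR matrices share row and column names, the two can be combined to select genes for a given contrast. The following sketch shows one way this might be done for a single contrast column. The function name, the gene identifiers, the values and the thresholds (FDR at most 0.05, absolute log fold change at least 1) are illustrative assumptions, not the criteria used in the accompanying publication.

```python
import math


def significant_genes(logfc, fdr, max_fdr=0.05, min_abs_logfc=1.0):
    """Return sorted gene identifiers passing both thresholds for one contrast.

    logfc and fdr map gene identifier -> value for the same contrast.
    Genes with a missing or NaN value in either input are skipped.
    """
    hits = []
    for gene, change in logfc.items():
        q_value = fdr.get(gene)
        if q_value is None or math.isnan(change) or math.isnan(q_value):
            continue
        if q_value <= max_fdr and abs(change) >= min_abs_logfc:
            hits.append(gene)
    return sorted(hits)


# Made-up values for one hypothetical contrast; identifiers are placeholders.
logfc = {"ENSG_A": 2.3, "ENSG_B": -0.4, "ENSG_C": -1.8, "ENSG_D": 3.0}
fdr = {"ENSG_A": 0.001, "ENSG_B": 0.0005, "ENSG_C": 0.03, "ENSG_D": 0.2}
hits = significant_genes(logfc, fdr)
```

In this toy example, ENSG_B is excluded by the fold change threshold and ENSG_D by the FDR threshold.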

Data set 11. Differential expression analyses across murine RNA-Seq sample groups: false discovery rates. As in Data set 10, but for mouse genes.

Data set 12. Differential expression analyses across simian RNA-Seq sample groups: false discovery rates. As in Data set 10, but for chimpanzee genes.

Data set 13. Quantification of alternative splicing events across human RNA-Seq samples. The file contains ‘percent spliced in’ (PSI) values computed by SUPPA (https://github.com/comprna/SUPPA) for annotated alternative splicing events (inferred from the transcript annotation of the human genome, Ensembl release 84; http://www.ensembl.org/). The format is a compressed (GZIP) tab-separated event-by-sample matrix. SUPPA-provided event identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.
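For readers unfamiliar with the measure, the sketch below illustrates the general idea behind a transcript-based PSI value: the summed abundance of the transcripts that include an event, divided by the summed abundance of all transcripts that define the event. This follows the approach described for SUPPA but is not its implementation; the function name, transcript names and TPM values are hypothetical.

```python
def percent_spliced_in(tpm, inclusion_transcripts, event_transcripts):
    """Compute PSI for one event in one sample from transcript abundances.

    tpm maps transcript identifier -> TPM.
    inclusion_transcripts: transcripts containing the event's inclusion form.
    event_transcripts: all transcripts defining the event (inclusion and exclusion).
    Returns None when none of the event's transcripts are expressed.
    """
    total = sum(tpm.get(name, 0.0) for name in event_transcripts)
    if total == 0:
        return None  # PSI is undefined without expression
    included = sum(tpm.get(name, 0.0) for name in inclusion_transcripts)
    return included / total


# Made-up abundances: tx3 does not overlap the event and is ignored.
tpm = {"tx1": 30.0, "tx2": 10.0, "tx3": 60.0}
psi = percent_spliced_in(tpm, ["tx1"], ["tx1", "tx2"])
```

Events for which PSI is undefined may appear as missing values in a matrix of this kind, so downstream code should be prepared to handle them.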

Data set 14. Quantification of alternative splicing events across murine RNA-Seq samples. As in Data set 13, but for mouse alternative splicing events.

Data set 15. Differential splicing analyses across human RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). The file contains ΔPSI values for human alternative splicing events (as in Data set 13). The RNA-Seq sample group contrasts are listed in Additional file 3 of the accompanying publication. Values were inferred by SUPPA’s diffSplice functionality (https://github.com/comprna/SUPPA). The format is a compressed (GZIP) tab-separated event-by-comparison matrix. SUPPA event identifiers and a descriptive contrast identifier serve as row and column names, respectively.
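As a rough illustration of what a ΔPSI value expresses, the sketch below takes the difference between the mean PSI values of two sample groups for a single event. It is a simplified stand-in, not SUPPA's diffSplice procedure, which additionally assesses significance. The function names and values are hypothetical, and the direction of the difference (second group minus first) is an assumption that should be checked against the contrast definitions in Additional file 3.

```python
import math


def mean_defined(values):
    """Mean of the values that are neither None nor NaN; None if none remain."""
    kept = [v for v in values if v is not None and not math.isnan(v)]
    if not kept:
        return None
    return sum(kept) / len(kept)


def delta_psi(psi_group_a, psi_group_b):
    """Difference of mean PSI between two groups (group B minus group A)."""
    mean_a = mean_defined(psi_group_a)
    mean_b = mean_defined(psi_group_b)
    if mean_a is None or mean_b is None:
        return None  # undefined when a group has no usable PSI value
    return mean_b - mean_a


# Made-up PSI values for one event in two groups of replicates.
dpsi = delta_psi([0.2, 0.4], [0.7, 0.9, None])
```

By construction, ΔPSI values lie between -1 and 1, with the sign indicating in which group the event's inclusion form is relatively more abundant.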

Data set 16. Differential splicing analyses across murine RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). As in Data set 15, but for mouse alternative splicing events.

Data set 17. Differential splicing analyses across human RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of human alternative splicing events summarized in Data set 15. Format, column and row names as in Data set 15.

Data set 18. Differential splicing analyses across murine RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of mouse alternative splicing events summarized in Data set 16. Format, column and row names as in Data set 15.