PanOryza-pan-genes-release-v1.0

This repository hosts the code for recreating the analyses in the PanOryza manuscript. The code for GET_PANGENES is available from https://github.com/Ensembl/plant-scripts/blob/master/pangenes/. The code for the Nipponbare merged genes is available from https://github.com/Ensembl/plant-scripts/tree/master/scripts. The input files (.fasta and .gff format) for running GET_PANGENES are available from Zenodo (https://zenodo.org/records/14772953). Alternatively, the output files for Os4530.POR.1 (version 1.0) are available in the same Zenodo repository and can be used for downstream analyses of the pan-genes with the code available here.

To reproduce the entire analysis starting from the GET_PANGENES results, first prepare the tables and intermediate files listed in this README; they are needed to recreate the manuscript figures.

Running get_pangenes with the RPRP (MAGIC-16 accessions) as input produces the following set of files:

.cluster_list --> parsed into tabular format with the function parse_clusters --> output table named "df_merged"
.matrix_genes.tr.tab --> read directly as a table named "pangene_list"
.matrix.tr.tab
Individual clusters inside the folder 'oryzasativanipponbaremerged' --> the *.cds.faa files of the clusters are used to calculate and summarise cluster and individual protein lengths. A cluster sequence summary can be created in R with create_cluster_sum (NOTE: there are also several ways to do this from a Linux terminal). The resulting summary can then be parsed into a dataframe with read_parse_clusters_summary.
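
The length summary over the per-cluster *.cds.faa files can be illustrated with a short sketch. This is not the repository's create_cluster_sum function (which is written in R); it is a minimal Python example under stated assumptions: each file is a plain protein FASTA, the sequence ID is the first token of the header line, and a trailing "*" marks a stop and is not counted. The function names read_fasta_lengths and summarise_clusters are hypothetical and chosen only for this example.

```python
from pathlib import Path
from statistics import mean, median


def read_fasta_lengths(path):
    """Return {sequence_id: length} for one FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: the ID is the first whitespace-delimited header token.
                tokens = line[1:].split()
                current = tokens[0] if tokens else ""
                lengths[current] = 0
            elif current is not None:
                # Drop a trailing '*' (stop symbol) so it is not counted.
                lengths[current] += len(line.rstrip("*"))
    return lengths


def summarise_clusters(cluster_dir, pattern="*.cds.faa"):
    """Return one summary row (a dict) per cluster FASTA file in cluster_dir."""
    rows = []
    for fasta in sorted(Path(cluster_dir).glob(pattern)):
        lengths = list(read_fasta_lengths(fasta).values())
        if not lengths:
            continue  # skip empty files
        rows.append({
            "file": fasta.name,
            "n_seqs": len(lengths),
            "min_len": min(lengths),
            "max_len": max(lengths),
            "mean_len": mean(lengths),
            "median_len": median(lengths),
        })
    return rows
```

Calling summarise_clusters("oryzasativanipponbaremerged") would return one row per cluster file, which could then be written out as a table or compared against the R-generated summary.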

An additional table named "cluster_merged", created by combining "pangene_list" and "df_merged", is used in various places.
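
How two such tables can be combined is sketched below. This is only an illustration, not the repository's code: it assumes both tables are tab-separated with a header row and share a cluster identifier column, here hypothetically called "cluster", and that this identifier is unique in the second table. The actual column names and join logic used to build "cluster_merged" may differ. The helper names read_tsv and merge_on_key are hypothetical.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file with a header row into a list of dict rows."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def merge_on_key(left_rows, right_rows, key="cluster"):
    """Inner-join two lists of dict rows on a shared key column.

    Assumption: `key` is unique in right_rows; left_rows may repeat it
    (for example, one row per gene within a cluster).
    """
    right_index = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_index.get(row[key])
        if match is not None:
            # Columns from the right table are appended to the left row.
            merged.append({**row, **match})
    return merged
```

Rows from the first table without a matching cluster identifier in the second are dropped; a left join that keeps them would be an equally reasonable choice, depending on the downstream analysis.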

InterProScan tabular results for the magic18 protein sequences were merged with the cluster files above. It is recommended to load the workspace core_workspace.RData (available at Zenodo) in R/RStudio, which also loads these InterProScan results for the pan-genes. Alternatively, core_files.R can be used to read all the files needed for downstream analysis.

To reproduce the figure-wise analyses, please refer to the scripts folder.
