Retrieve column info for parquet files based on original file type
Source: R/utils.R
'parquet_colinfo' returns the column info associated with a parquet file made from a particular output file type.
Value
Data frame with columns 'general_data_type', 'col_name', 'col_class', 'description', 'se_role', and 'ref_file'
Examples
parquet_colinfo("viral_clusters")
#>    general_data_type                 col_name col_class
#> 1     viral_clusters          m_group_cluster character
#> 2     viral_clusters              genome_name character
#> 3     viral_clusters                   length       int
#> 4     viral_clusters      breadth_of_coverage     float
#> 5     viral_clusters   depth_of_coverage_mean     float
#> 6     viral_clusters depth_of_coverage_median     float
#> 7     viral_clusters         m_group_type_k_u character
#> 8     viral_clusters  first_genome_in_cluster character
#> 9     viral_clusters            other_genomes character
#> 10    viral_clusters                     uuid character
#> 11    viral_clusters               db_version character
#> 12    viral_clusters                  command character
#> 13    viral_clusters         metaphlan_header character
#> 14    viral_clusters         original_columns character
#>                                                                      description
#> 1                                     The marker gene group or cluster detected
#> 2                                 The database identifier of the matched genome
#> 3                          Length of the genome (in base pairs) in the database
#> 4                                 The proportion of the genome covered by reads
#> 5                                The average sequencing depth across the genome
#> 6                                 The median sequencing depth across the genome
#> 7        Whether the matched genome has known (kVSG) or unknown (uVSG) taxonomy
#> 8  The first genome in the matched cluster, if that cluster's taxonomy is known
#> 9              Other genomes in the same cluster that share the same marker set
#> 10                                                                  Sample UUID
#> 11                                     MetaPhlAn database version(s) referenced
#> 12                                                      MetaPhlAn command given
#> 13                                                MetaPhlAn's custom header row
#> 14                                              Original MetaPhlAn column names
#>    se_role        ref_file
#> 1    rdata            <NA>
#> 2    rname genome_name_ref
#> 3    rdata            <NA>
#> 4    assay            <NA>
#> 5    assay            <NA>
#> 6    assay            <NA>
#> 7    rdata            <NA>
#> 8    rdata            <NA>
#> 9    rdata            <NA>
#> 10   cname            <NA>
#> 11   cdata            <NA>
#> 12   cdata            <NA>
#> 13   cdata            <NA>
#> 14   cdata            <NA>
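
Because the result is an ordinary data frame, it can be subset with base R. The sketch below groups column names by their 'se_role' value, for instance to pick out the columns marked 'assay'. It assumes nothing about the package beyond the printed output above: 'colinfo' is a hand-built stand-in holding a few of those rows, and 'cols_by_role' and 'assay_cols' are illustrative names, not part of the package.

```r
# Stand-in for parquet_colinfo("viral_clusters"): a few rows copied from the
# printed example so this sketch runs without the package installed.
colinfo <- data.frame(
  col_name = c("genome_name", "length", "breadth_of_coverage",
               "depth_of_coverage_mean", "depth_of_coverage_median", "uuid"),
  col_class = c("character", "int", "float", "float", "float", "character"),
  se_role = c("rname", "rdata", "assay", "assay", "assay", "cname"),
  stringsAsFactors = FALSE
)

# Group the column names by their se_role value.
cols_by_role <- split(colinfo$col_name, colinfo$se_role)

# Names of the columns marked as assays.
assay_cols <- cols_by_role[["assay"]]
print(assay_cols)
```

With the real function, replacing the stand-in with `colinfo <- parquet_colinfo("viral_clusters")` should give the same grouping over all 14 rows.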