Title: | Streamline Population Genomic and Genetic Analyses |
---|---|
Description: | Estimate commonly used population genomic statistics and generate publication quality figures. 'PopGenHelpR' uses vcf, 'geno' (012), and csv files to generate output. |
Authors: | Keaka Farleigh [aut, cph, cre] |
Maintainer: | Keaka Farleigh <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.3.2 |
Built: | 2025-02-26 04:46:14 UTC |
Source: | https://github.com/kfarleigh/popgenhelpr |
Plot an ancestry matrix for individuals and/or populations.
Ancestry_barchart( anc.mat, pops, K, plot.type = "all", col, ind.order = NULL, pop.order = NULL )
anc.mat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The first column should be the names of each sample/population, followed by the estimated contribution of each cluster to that individual/pop. |
pops |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The columns should be named Sample, containing the sample IDs; Population, indicating the population assignment of the individual; Long, indicating the longitude of the sample; and Lat, indicating the latitude of the sample. Population and sample names must be the same type (i.e., both numeric or both character). |
K |
Numeric. The number of genetic clusters in your data set. Please contact the package authors if you need help determining this. |
plot.type |
Character string. Options are all, individual, and population. The default, all, is recommended and plots a bar chart for both the individuals and the populations. |
col |
Character vector indicating the colors you wish to use for plotting. |
ind.order |
Character vector indicating the order to plot the individuals in the individual ancestry bar chart. |
pop.order |
Character vector indicating the order to plot the populations in the population ancestry bar chart. |
A list containing your plots and the data frames used to generate the plots.
Keaka Farleigh
data(Q_dat)
Qmat <- Q_dat[[1]]
rownames(Qmat) <- Qmat[,1]
Loc <- Q_dat[[2]]
Test_all <- Ancestry_barchart(anc.mat = Qmat, pops = Loc, K = 5, plot.type = 'all', col = c('#d73027', '#fc8d59', '#e0f3f8', '#91bfdb', '#4575b4'))
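The input layout described for anc.mat and pops can be sketched in base R. This is only an illustration under stated assumptions, not code from the package: the sample names, the Cluster1/Cluster2 column names, the coordinates, and the ancestry values are invented, and averaging ancestry within populations is an assumption about what a population-level summary might look like.

```r
# Hypothetical ancestry matrix: first column holds sample names,
# the remaining columns hold the contribution of each cluster (K = 2 here).
anc <- data.frame(
  Sample   = paste0("S", 1:4),
  Cluster1 = c(0.9, 0.8, 0.2, 0.1),
  Cluster2 = c(0.1, 0.2, 0.8, 0.9)
)

# Hypothetical population assignment table with the documented column names.
pops <- data.frame(
  Sample     = paste0("S", 1:4),
  Population = c("East", "East", "West", "West"),
  Long       = c(-112.1, -112.3, -116.0, -116.4),
  Lat        = c(36.5, 36.8, 38.1, 38.3)
)

# Sanity checks: ancestry proportions sum to 1 and both tables share sample order.
stopifnot(all(abs(rowSums(anc[, -1]) - 1) < 1e-8))
stopifnot(identical(anc$Sample, pops$Sample))

# Assumed population-level summary: mean ancestry per population.
pop_anc <- aggregate(anc[, -1], by = list(Population = pops$Population), FUN = mean)
print(pop_anc)
```

With tables shaped like these, K would be the number of cluster columns (here 2) and col would need one color per cluster.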
A function to estimate three measures of genetic differentiation using geno files, vcf files, or vcfR objects. Data is assumed to be bi-allelic.
Differentiation( data, pops, statistic = "all", missing_value = NA, write = FALSE, prefix = NULL, population_col = NULL, individual_col = NULL )
data |
Character. String indicating the name of the vcf file, geno file or vcfR object to be used in the analysis. |
pops |
Character. String indicating the name of the population assignment file or dataframe containing the population assignment information for each individual in the data. This file must be in the same order as the vcf file and include columns specifying the individual and the population that individual belongs to. The first column should contain individual names and the second column should indicate the population assignment of each individual. Alternatively, you can indicate the column containing the individual and population information using the individual_col and population_col arguments. |
statistic |
Character. String or vector indicating the statistic to calculate. Options are any of: all, all of the statistics; Fst, Weir and Cockerham (1984) Fst; NeisD, Nei's D statistic; JostsD, Jost's D. |
missing_value |
Character. String indicating missing data in the input data. It is assumed to be NA, but that may not be true (is likely not) in the case of geno files. |
write |
Boolean. Whether or not to write the output to files in the current working directory. There will be one or two files for each statistic. Files will be named based on their statistic such as Fst_perpop.csv. |
prefix |
Character. Optional argument. String that will be appended to file output. Please provide a prefix if write is set to TRUE. |
population_col |
Numeric. Optional argument (a number) indicating the column that contains the population assignment information. |
individual_col |
Numeric. Optional argument (a number) indicating the column that contains the individuals (i.e., sample name) in the data. |
A list containing the estimated differentiation statistics.
Keaka Farleigh
Fst:
Pembleton, L. W., Cogan, N. O., & Forster, J. W. (2013). StAMPP: An R package for calculation of genetic differentiation and structure of mixed‐ploidy level populations. Molecular ecology resources, 13(5), 946-952. doi:10.1111/1755-0998.12129
Weir, B. S., & Cockerham, C. C. (1984). Estimating F-statistics for the analysis of population structure. Evolution, 38(6), 1358-1370.
Nei's D:
Nei, M. (1972). Genetic distance between populations. The American Naturalist, 106(949), 283-292. doi:10.1086/282771
Pembleton, L. W., Cogan, N. O., & Forster, J. W. (2013). StAMPP: An R package for calculation of genetic differentiation and structure of mixed‐ploidy level populations. Molecular ecology resources, 13(5), 946-952. doi:10.1111/1755-0998.12129
Jost's D:
Jost, L. (2008). GST and its relatives do not measure differentiation. Molecular Ecology, 17, 4015–4026. doi:10.1111/j.1365-294X.2008.03887.x
data("HornedLizard_Pop")
data("HornedLizard_VCF")
Test <- Differentiation(data = HornedLizard_VCF, pops = HornedLizard_Pop, write = FALSE)
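To make one of the listed statistics concrete, the sketch below computes Nei's (1972) standard genetic distance between two populations from a toy 012 genotype matrix in base R. It is a textbook formulation for bi-allelic loci and is not taken from the package source, so the package's own estimator may differ in details such as missing-data handling. The genotypes, population labels, and helper names (alt_freq, nei_d) are hypothetical.

```r
# Toy 012 genotypes: rows are individuals, columns are bi-allelic loci.
# All values are invented for illustration.
geno <- matrix(c(0, 0, 1,
                 0, 1, 1,
                 2, 2, 1,
                 2, 1, 1),
               nrow = 4, byrow = TRUE)
pop <- c("A", "A", "B", "B")

# Alternate-allele frequency per locus within one population.
alt_freq <- function(g) colMeans(g, na.rm = TRUE) / 2

# Nei's (1972) standard genetic distance for bi-allelic loci:
# D = -ln(Jxy / sqrt(Jx * Jy)), with J terms averaged over loci.
nei_d <- function(px, py) {
  jx  <- mean(px^2 + (1 - px)^2)              # gene identity within X
  jy  <- mean(py^2 + (1 - py)^2)              # gene identity within Y
  jxy <- mean(px * py + (1 - px) * (1 - py))  # gene identity between X and Y
  -log(jxy / sqrt(jx * jy))
}

p_a <- alt_freq(geno[pop == "A", , drop = FALSE])
p_b <- alt_freq(geno[pop == "B", , drop = FALSE])
d_ab <- nei_d(p_a, p_b)
print(d_ab)
```

A population compared with itself gives a distance of zero, which is a quick check that the frequencies were computed per population rather than across the whole data set.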
A symmetric matrix with estimated genetic differentiation (Fst) between 3 populations.
data(Fst_dat)
A list with two elements:
Data frame with three rows and three columns
Data frame containing the locality information for each population
...
Farleigh, K., Vladimirova, S. A., Blair, C., Bracken, J. T., Koochekian, N., Schield, D. R., ... & Jezkova, T. (2021). The effects of climate and demographic history in shaping genomic variation across populations of the Desert Horned Lizard (Phrynosoma platyrhinos). Molecular Ecology, 30(18), 4481-4496.
data(Fst_dat)
Fst <- Fst_dat[[1]]
Loc <- Fst_dat[[2]]
Test <- Network_map(dat = Fst, pops = Loc, neighbors = 2, col = c('#4575b4', '#91bfdb', '#e0f3f8','#fd8d3c','#fc4e2a'), statistic = "Fst", Lat_buffer = 1, Long_buffer = 1)
Fstat_plot <- Pairwise_heatmap(dat = Fst, statistic = 'FST')
Data frame containing 5 columns and 3 rows
data(Het_dat)
A data frame with 5 columns and 3 rows:
Estimated heterozygosity
Population assignment
standard deviation
Longitude
Latitude
...
Coordinates and population names taken from Farleigh, K., Vladimirova, S. A., Blair, C., Bracken, J. T., Koochekian, N., Schield, D. R., ... & Jezkova, T. (2021). The effects of climate and demographic history in shaping genomic variation across populations of the Desert Horned Lizard (Phrynosoma platyrhinos). Molecular Ecology, 30(18), 4481-4496.
data(Het_dat)
Test <- Point_map(Het_dat, statistic = "Heterozygosity")
A function to estimate seven measures of heterozygosity using geno files, vcf files, or vcfR objects. Data is assumed to be bi-allelic.
Heterozygosity( data, pops, statistic = "all", missing_value = NA, write = FALSE, prefix = NULL, population_col = NULL, individual_col = NULL )
data |
Character. String indicating the name of the vcf file, geno file or vcfR object to be used in the analysis. |
pops |
Character. String indicating the name of the population assignment file or dataframe containing the population assignment information for each individual in the data. This file must be in the same order as the vcf file and include columns specifying the individual and the population that individual belongs to. The first column should contain individual names and the second column should indicate the population assignment of each individual. Alternatively, you can indicate the column containing the individual and population information using the individual_col and population_col arguments. |
statistic |
Character. String or vector indicating the statistic to calculate. Options are any of: all, all of the statistics; Ho, observed heterozygosity; He, expected heterozygosity; PHt, proportion of heterozygous loci; Hs_exp, heterozygosity standardized by the average expected heterozygosity; Hs_obs, heterozygosity standardized by the average observed heterozygosity; IR, internal relatedness; HL, homozygosity by locus. |
missing_value |
Character. String indicating missing data in the input data. It is assumed to be NA, but that may not be true (is likely not) in the case of geno files. |
write |
Boolean. Whether or not to write the output to files in the current working directory. There will be one or two files for each statistic. Files will be named based on their statistic such as Ho_perpop.csv or Ho_perloc.csv. |
prefix |
Character. Optional argument. String that will be appended to file output. Please provide a prefix if write is set to TRUE. |
population_col |
Numeric. Optional argument (a number) indicating the column that contains the population assignment information. |
individual_col |
Numeric. Optional argument (a number) indicating the column that contains the individuals (i.e., sample name) in the data. |
A list containing the estimated heterozygosity statistics. The per pop values are calculated by taking the average of the per locus estimates.
Keaka Farleigh
Expected (He) and observed heterozygosity (Ho):
Nei, M. (1987) Molecular Evolutionary Genetics. Columbia University Press
Homozygosity by locus (HL) and internal relatedness (IR):
Alho, J. S., Välimäki, K., & Merilä, J. (2010). Rhh: an R extension for estimating multilocus heterozygosity and heterozygosity–heterozygosity correlation. Molecular ecology resources, 10(4), 720-722.
Amos, W., Worthington Wilmer, J., Fullard, K., Burg, T. M., Croxall, J. P., Bloch, D., & Coulson, T. (2001). The influence of parental relatedness on reproductive success. Proceedings of the Royal Society of London. Series B: Biological Sciences, 268(1480), 2021-2027. doi:10.1098/rspb.2001.1751
Aparicio, J. M., Ortego, J., & Cordero, P. J. (2006). What should we weigh to estimate heterozygosity, alleles or loci?. Molecular Ecology, 15(14), 4659-4665.
Heterozygosity standardized by expected (Hs_exp) and observed heterozygosity (Hs_obs):
Coltman, D. W., Pilkington, J. G., Smith, J. A., & Pemberton, J. M. (1999). Parasite‐mediated selection against Inbred Soay sheep in a free‐living island population. Evolution, 53(4), 1259-1267.
data("HornedLizard_Pop")
data("HornedLizard_VCF")
Test <- Heterozygosity(data = HornedLizard_VCF, pops = HornedLizard_Pop, write = FALSE)
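The statement that per-population values are the average of per-locus estimates can be illustrated for Ho and He with a few lines of base R. This is a sketch under simplifying assumptions, not the package's code: He is computed as the uncorrected 2pq for bi-allelic loci (no small-sample correction), and the genotype matrix, population labels, and helper name het_by_pop are all hypothetical.

```r
# Toy 012 genotypes: rows are individuals, columns are bi-allelic loci.
geno <- matrix(c(0, 0, 1,
                 0, 1, 1,
                 2, 2, 1,
                 2, 1, 1),
               nrow = 4, byrow = TRUE)
pop <- c("A", "A", "B", "B")

# Per-population Ho and He, each averaged over per-locus estimates.
het_by_pop <- function(geno, pop) {
  idx_by_pop <- split(seq_len(nrow(geno)), pop)
  do.call(rbind, lapply(idx_by_pop, function(idx) {
    g      <- geno[idx, , drop = FALSE]
    ho_loc <- colMeans(g == 1, na.rm = TRUE)   # share of heterozygotes per locus
    p      <- colMeans(g, na.rm = TRUE) / 2    # alternate-allele frequency per locus
    he_loc <- 2 * p * (1 - p)                  # uncorrected expected heterozygosity
    c(Ho = mean(ho_loc), He = mean(he_loc))
  }))
}

res <- het_by_pop(geno, pop)
print(res)
```

In this toy data both populations have one fixed locus, one locus with a single heterozygote, and one locus where everyone is heterozygous, so they end up with identical Ho and He.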
Example data for Heterozygosity and Differentiation. Data frame containing 4 columns and 72 rows.
data(HornedLizard_Pop)
A data frame with 4 columns and 72 rows:
Sample Name
Population assignment according to sNMF results (see citation)
Longitude
Latitude
...
Coordinates and population names taken from Farleigh, K., Vladimirova, S. A., Blair, C., Bracken, J. T., Koochekian, N., Schield, D. R., ... & Jezkova, T. (2021). The effects of climate and demographic history in shaping genomic variation across populations of the Desert Horned Lizard (Phrynosoma platyrhinos). Molecular Ecology, 30(18), 4481-4496.
data("HornedLizard_Pop")
data("HornedLizard_VCF")
Test <- Differentiation(data = HornedLizard_VCF, pops = HornedLizard_Pop, write = FALSE)
Example data for Heterozygosity and Differentiation.
data(HornedLizard_VCF)
A vcfR object
A vcfR object containing genotype and sample information for 72 individuals.
...
Farleigh, K., Vladimirova, S. A., Blair, C., Bracken, J. T., Koochekian, N., Schield, D. R., ... & Jezkova, T. (2021). The effects of climate and demographic history in shaping genomic variation across populations of the Desert Horned Lizard (Phrynosoma platyrhinos). Molecular Ecology, 30(18), 4481-4496.
data("HornedLizard_Pop")
data("HornedLizard_VCF")
Test <- Heterozygosity(data = HornedLizard_VCF, pops = HornedLizard_Pop, write = FALSE)
A function to map statistics (i.e., genetic differentiation) between points as a network on a map.
Network_map( dat, pops, neighbors, col, statistic = NULL, breaks = NULL, Lat_buffer = 1, Long_buffer = 1, Latitude_col = NULL, Longitude_col = NULL )
dat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. If it is a csv, the 1st row should contain the individual/population names. The columns should also be named in this fashion. |
pops |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The columns should be named Sample, containing the sample IDs; Population indicating the population assignment of the individual; Long, indicating the longitude of the sample; Lat, indicating the latitude of the sample. Alternatively, see the Longitude_col and Latitude_col arguments. |
neighbors |
Numeric or character. The number of neighbors to plot connections with, or the specific relationship that you want to visualize. Names should match those in the population assignment file and be separated by an underscore. For example, to visualize the relationship between East and West, set neighbors = "East_West". |
col |
Character vector indicating the colors you wish to use for plotting. |
statistic |
Character indicating the statistic being plotted. This will be used to title the legend. The legend title will be blank if left as NULL. |
breaks |
Numeric. The breaks used to generate the color ramp when plotting. Users should supply 3 values if custom breaks are desired. |
Lat_buffer |
Numeric. A buffer to customize visualization. |
Long_buffer |
Numeric. A buffer to customize visualization. |
Latitude_col |
Numeric. The number of the column indicating the latitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Lat column. |
Longitude_col |
Numeric. The number of the column indicating the longitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Long column. |
A list containing the map and the matrix used to plot the map.
Keaka Farleigh
data(Fst_dat)
Fst <- Fst_dat[[1]]
Loc <- Fst_dat[[2]]
Test <- Network_map(dat = Fst, pops = Loc, neighbors = 2, col = c('#4575b4', '#91bfdb', '#e0f3f8','#fd8d3c','#fc4e2a'), statistic = "Fst", Lat_buffer = 1, Long_buffer = 1)
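When neighbors is given as a character string such as "East_West", the two names need to match the population assignment table. The base R sketch below builds the possible pair labels from a hypothetical table and checks them against it. The population names and coordinates are invented, and whether the order of the two names within a label matters is not stated in the documentation, so that is left open here.

```r
# Hypothetical population assignment table with the documented column names.
pops <- data.frame(
  Sample     = paste0("S", 1:6),
  Population = rep(c("East", "South", "West"), each = 2),
  Long       = c(-112.1, -112.3, -114.0, -114.2, -116.0, -116.4),
  Lat        = c(36.5, 36.8, 34.9, 35.1, 38.1, 38.3)
)

# All unordered population pairs, joined with an underscore.
pop_names   <- sort(unique(pops$Population))
pair_labels <- combn(pop_names, 2, paste, collapse = "_")
print(pair_labels)

# Check that every name in every label exists in the table.
parts <- unlist(strsplit(pair_labels, "_", fixed = TRUE))
stopifnot(all(parts %in% pops$Population))
```

Population names that themselves contain an underscore may be ambiguous under this naming scheme, so it can be worth checking for them before building labels.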
A function to plot a heatmap from a symmetric matrix.
Pairwise_heatmap(dat, statistic, col = NULL)
dat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. If it is a csv, the 1st row should contain the individual/population names. The columns should also be named in this fashion. |
statistic |
Character indicating the statistic represented in the matrix. This will be used to label the plot. |
col |
Character vector indicating the colors to be used in plotting. The vector should contain two colors, the first will be the low value, the second will be the high value. |
A heatmap plot
data(Fst_dat)
Fst <- Fst_dat[[1]]
Fstat_plot <- Pairwise_heatmap(dat = Fst, statistic = 'FST')
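The symmetric input that dat expects, with population names on both rows and columns, can be assembled in base R as sketched below. The population names and pairwise values are hypothetical and only show the shape of the object; they are not results from any data set.

```r
# Hypothetical population names and pairwise values.
pop_names <- c("East", "South", "West")
fst <- matrix(0, nrow = 3, ncol = 3, dimnames = list(pop_names, pop_names))

# Fill the lower triangle (column-major order), then mirror it.
fst[lower.tri(fst)] <- c(0.12, 0.30, 0.21)
fst <- fst + t(fst)

# The matrix should be symmetric with a zero diagonal.
stopifnot(isSymmetric(fst), all(diag(fst) == 0))

# Data frame version with named rows and columns.
fst_df <- as.data.frame(fst)
print(fst_df)
```

Checking symmetry before plotting catches the common mistake of filling only one triangle of the matrix.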
A function to perform principal component analysis (PCA) on genetic data. Loci with missing data will be removed prior to PCA.
PCA( data, center = TRUE, scale = FALSE, missing_value = NA, write = FALSE, prefix = NULL )
data |
Character. String indicating the name of the vcf file, geno file or vcfR object to be used in the analysis. |
center |
Boolean. Whether or not to center the data before principal component analysis. |
scale |
Boolean. Whether or not to scale the data before principal component analysis. |
missing_value |
Character. String indicating missing data in the input data. It is assumed to be NA, but that may not be true (is likely not) in the case of geno files. |
write |
Boolean. Whether or not to write the output to files in the current working directory. There will be two files, one for the individual loadings and the other for the percent variance explained by each axis. |
prefix |
Character. Optional argument. String that will be appended to file output. Please provide a prefix if write is set to TRUE. |
A list containing two elements: the loadings of individuals on each principal component and the variance explained by each principal component.
Keaka Farleigh
data("HornedLizard_VCF")
Test <- PCA(data = HornedLizard_VCF)
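The workflow described for PCA (recode the missing-data value, drop loci with missing data, then run a centered PCA) can be approximated with base R's prcomp. This is a sketch under assumptions rather than the package's implementation: the toy genotypes are invented, and 9 is used as the missing-data code only because it is a common choice in geno files, so check what your own file uses.

```r
# Toy 012 genotypes with one missing call coded as 9 (assumed code).
geno <- matrix(c(0, 1, 2, 9,
                 0, 1, 1, 0,
                 2, 1, 0, 1,
                 2, 0, 0, 2,
                 1, 2, 1, 0),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("ind", 1:5), paste0("loc", 1:4)))

# Recode the missing-data value to NA.
missing_value <- 9
geno[geno == missing_value] <- NA

# Remove every locus (column) that has at least one missing genotype.
complete <- geno[, colSums(is.na(geno)) == 0, drop = FALSE]

# Centered, unscaled PCA, mirroring center = TRUE and scale = FALSE.
pca <- prcomp(complete, center = TRUE, scale. = FALSE)

scores  <- pca$x                                # individual coordinates on each axis
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)   # percent variance per axis
print(round(pct_var, 1))
```

Because whole loci are dropped, a single missing call removes that locus for every individual, which is worth keeping in mind for data sets with scattered missing data.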
Plot a map of ancestry pie charts.
Piechart_map( anc.mat, pops, K, plot.type = "all", col, piesize = 0.35, Lat_buffer, Long_buffer, Latitude_col = NULL, Longitude_col = NULL )
anc.mat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The first column should be the names of each sample/population, followed by the estimated contribution of each cluster to that individual/pop. |
pops |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The columns should be named Sample, containing the sample IDs; Population, indicating the population assignment of the individual; Long, indicating the longitude of the sample; and Lat, indicating the latitude of the sample. Population and sample names must be the same type (i.e., both numeric or both character). Alternatively, see the Longitude_col and Latitude_col arguments. |
K |
Numeric. The number of genetic clusters in your data set. Please contact the package authors if you need help determining this. |
plot.type |
Character string. Options are all, individual, and population. The default, all, is recommended and plots a pie chart map for both the individuals and the populations. |
col |
Character vector indicating the colors you wish to use for plotting. |
piesize |
Numeric. The radius of the pie chart for ancestry mapping. |
Lat_buffer |
Numeric. A buffer to customize visualization. |
Long_buffer |
Numeric. A buffer to customize visualization. |
Latitude_col |
Numeric. The number of the column indicating the latitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Lat column. |
Longitude_col |
Numeric. The number of the column indicating the longitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Long column. |
A list containing your plots and the data frames used to generate the plots.
Keaka Farleigh
data(Q_dat)
Qmat <- Q_dat[[1]]
rownames(Qmat) <- Qmat[,1]
Loc <- Q_dat[[2]]
Test_all <- Piechart_map(anc.mat = Qmat, pops = Loc, K = 5, plot.type = 'all', col = c('#d73027', '#fc8d59', '#e0f3f8', '#91bfdb', '#4575b4'), piesize = 0.35, Lat_buffer = 1, Long_buffer = 1)
A function to plot coordinates on a map.
Plot_coordinates( dat, col = c("#A9A9A9", "#000000"), size = 3, Lat_buffer = 1, Long_buffer = 1, Latitude_col = NULL, Longitude_col = NULL )
dat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The coordinates of each row should be indicated by columns named Longitude and Latitude. Alternatively, see the Latitude_col and Longitude_col arguments. |
col |
Character vector indicating the colors you wish to use for plotting. Two colors are allowed: the first is the fill color and the second is the outline color. For example, for red points with a black outline, set col = c("#FF0000", "#000000"). |
size |
Numeric. The size of the points to plot. |
Lat_buffer |
Numeric. A buffer to customize visualization. |
Long_buffer |
Numeric. A buffer to customize visualization. |
Latitude_col |
Numeric. The number of the column indicating the latitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Latitude column. |
Longitude_col |
Numeric. The number of the column indicating the longitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Longitude column. |
A ggplot object.
Keaka Farleigh
data("HornedLizard_Pop")
Test <- Plot_coordinates(HornedLizard_Pop)
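When the coordinate columns are not named Longitude and Latitude, the Longitude_col and Latitude_col arguments take column numbers instead. The base R sketch below looks up those numbers by name, or alternatively renames the columns. The data frame and its column names (Site, x, y) are hypothetical.

```r
# Hypothetical site table whose coordinate columns use non-standard names.
sites <- data.frame(
  Site = c("a", "b", "c"),
  x    = c(-115.2, -113.9, -116.7),  # longitude in decimal degrees
  y    = c(36.1, 37.4, 39.0)         # latitude in decimal degrees
)

# Option 1: look up the column numbers to pass as Longitude_col / Latitude_col.
Longitude_col <- match("x", names(sites))
Latitude_col  <- match("y", names(sites))
stopifnot(!is.na(Longitude_col), !is.na(Latitude_col))

# Option 2: rename the columns to the names the function looks for by default.
renamed <- sites
names(renamed)[c(Longitude_col, Latitude_col)] <- c("Longitude", "Latitude")

# Basic range check to catch swapped coordinates.
stopifnot(all(abs(renamed$Latitude) <= 90), all(abs(renamed$Longitude) <= 180))
```

Looking up the numbers by name avoids hard-coding positions that change if columns are added or reordered.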
A function to map statistics as colored points on a map.
Point_map( dat, statistic, size = 3, breaks = NULL, col, out.col = NULL, Lat_buffer = 1, Long_buffer = 1, Latitude_col = NULL, Longitude_col = NULL )
dat |
Data frame or character string that supplies the input data. If it is a character string, the file should be a csv. The first column should be the statistic to be plotted. The coordinates of each row should be indicated by columns named Longitude and Latitude. Alternatively, see the Longitude_col and Latitude_col arguments. |
statistic |
Character string. The statistic to be plotted. |
size |
Numeric. The size of the points to plot. |
breaks |
Numeric. The breaks used to generate the color ramp when plotting. Users should supply 3 values if custom breaks are desired. |
col |
Character vector indicating the colors you wish to use for plotting. Three colors are allowed, in the order low, middle, high. |
out.col |
Character. A color for outlining points on the map. There will be no visible outline if left as NULL. |
Lat_buffer |
Numeric. A buffer to customize visualization. |
Long_buffer |
Numeric. A buffer to customize visualization. |
Latitude_col |
Numeric. The number of the column indicating the latitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Latitude column. |
Longitude_col |
Numeric. The number of the column indicating the longitude for each sample. If this is not null, PopGenHelpR will use this column instead of looking for the Longitude column. |
A list containing maps and the data frames used to generate them.
Keaka Farleigh
data(Het_dat)
Test <- Point_map(Het_dat, statistic = "Heterozygosity")
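The breaks argument expects three values. One simple way to derive them from the statistic being plotted is sketched below in base R, using the minimum, the midpoint of the range, and the maximum. This choice is an assumption made for illustration and is not necessarily how the package computes its default breaks; the data frame and its values are hypothetical.

```r
# Hypothetical input: the statistic is in the first column, followed by coordinates.
het <- data.frame(
  Heterozygosity = c(0.11, 0.15, 0.22),
  Longitude      = c(-115.2, -113.9, -116.7),
  Latitude       = c(36.1, 37.4, 39.0)
)

# Three candidate breaks: low, middle, high.
stat_range    <- range(het$Heterozygosity, na.rm = TRUE)
custom_breaks <- c(stat_range[1], mean(stat_range), stat_range[2])
print(custom_breaks)

# The breaks should be three increasing values.
stopifnot(length(custom_breaks) == 3, !is.unsorted(custom_breaks))
```

Three breaks pair naturally with the three colors (low, middle, high) that col accepts.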
A function to estimate the number of private alleles in each population.
Private.alleles( data, pops, write = FALSE, prefix = NULL, population_col = NULL, individual_col = NULL )
data |
Character. String indicating the name of the vcf file or vcfR object to be used in the analysis. |
pops |
Character. String indicating the name of the population assignment file or dataframe containing the population assignment information for each individual in the data. This file must be in the same order as the vcf file and include columns specifying the individual and the population that individual belongs to. The first column should contain individual names and the second column should indicate the population assignment of each individual. Alternatively, you can indicate the column containing the individual and population information using the individual_col and population_col arguments. |
write |
Boolean. Optional argument indicating whether or not to write the output to files in the current working directory. Two files will be written: 1) the table of private allele counts per population (named prefix_PrivateAlleles_countperpop) and 2) metadata associated with the private alleles (named prefix_PrivateAlleles_metadata). As a best practice, please supply a prefix if you write files to your working directory. |
prefix |
Character. Optional argument indicating a string that will be appended to file output. Please set a prefix if write is TRUE. |
population_col |
Numeric. Optional argument (a number) indicating the column that contains the population assignment information. |
individual_col |
Numeric. Optional argument (a number) indicating the column that contains the individuals (i.e., sample name) in the data. |
A list containing the count of private alleles in each population and the metadata for those alleles. The metadata is a list that contains the private allele and locus name for each population.
Keaka Farleigh
data("HornedLizard_Pop")
data("HornedLizard_VCF")
Test <- Private.alleles(data = HornedLizard_VCF, pops = HornedLizard_Pop, write = FALSE)
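The idea of a private allele, one observed in exactly one population, can be sketched from a toy 012 genotype matrix in base R. This is an illustration under assumptions and not the package's code: both the reference and the alternate allele at a locus are treated as candidates, which may differ from the package's definition, and the genotypes and population labels are invented.

```r
# Toy 012 genotypes: rows are individuals, columns are bi-allelic loci.
geno <- matrix(c(0, 0, 1,
                 0, 0, 1,
                 2, 0, 1,
                 2, 1, 1),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("ind", 1:4), paste0("loc", 1:3)))
pop <- c("A", "A", "B", "B")
pop_names <- unique(pop)

# Loci-by-population tables: is the allele observed in that population?
# Genotypes 1 or 2 carry the alternate allele; genotypes 0 or 1 carry the reference.
alt_present <- sapply(pop_names, function(p) {
  colSums(geno[pop == p, , drop = FALSE] >= 1, na.rm = TRUE) > 0
})
ref_present <- sapply(pop_names, function(p) {
  colSums(geno[pop == p, , drop = FALSE] <= 1, na.rm = TRUE) > 0
})

# An allele is private when exactly one population carries it.
private_alt <- alt_present & (rowSums(alt_present) == 1)
private_ref <- ref_present & (rowSums(ref_present) == 1)

# Count of private alleles per population.
private_counts <- colSums(private_alt) + colSums(private_ref)
print(private_counts)
```

The logical tables private_alt and private_ref also identify which locus each private allele sits at, which is the kind of information the returned metadata describes.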
List with two elements
data(Q_dat)
A list with two elements:
A q-matrix with 6 columns and 30 rows. The first column lists the sample name and the remaining 5 represent the contribution of each genetic cluster to that individual's ancestry
The locality information for each individual in the q-matrix
...
Data was generated by package authors.
data(Q_dat)
Qmat <- Q_dat[[1]]
rownames(Qmat) <- Qmat[,1]
Loc <- Q_dat[[2]]
Test_all <- Ancestry_barchart(anc.mat = Qmat, pops = Loc, K = 5, plot.type = 'all', col = c('#d73027', '#fc8d59', '#e0f3f8', '#91bfdb', '#4575b4'))