Calculates and plots kmer distributions and frequencies.

VDJ_kmers(
  VDJ,
  sequence.column,
  grouping.column,
  kmer.k,
  max.kmers,
  specific.kmers,
  plot.format,
  as.proportions
)

Arguments

VDJ

VDJ dataframe output from the VDJ_GEX_matrix function.

sequence.column

Character vector. One or more sequence column names from the VDJ for kmer counting. If more than one column is provided (e.g. c("VDJ_cdr3s_aa","VJ_cdr3s_aa")), these columns will be pasted together before the kmers are counted.
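As an illustration of what pasting columns together means, here is a minimal base R sketch. It is not the package's internal code: the toy data frame is made up, and joining without a separator is an assumption.

```r
# Toy data frame; column names follow the example above, sequences are made up.
vdj <- data.frame(
  VDJ_cdr3s_aa = c("CARDYW", "CARGGW"),
  VJ_cdr3s_aa  = c("CQQSYT", "CQQYNT"),
  stringsAsFactors = FALSE
)

cols <- c("VDJ_cdr3s_aa", "VJ_cdr3s_aa")

# Paste the selected columns row-wise; using no separator is an assumption.
combined <- do.call(paste0, unname(as.list(vdj[cols])))
print(combined)
```

Note that with this kind of concatenation some kmers span the junction between the two original sequences.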

grouping.column

Character. Name of the column by which kmer counting is grouped, for example "sample_id" to count kmers per sample.

kmer.k

Integer. Length k of each kmer.
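A kmer of length k is commonly taken to be every overlapping substring of k characters. The base R sketch below counts kmers that way. The helper get_kmers is hypothetical and only illustrates the idea; the function's actual counting method may differ.

```r
# Hypothetical helper: all overlapping substrings of length k in one sequence.
get_kmers <- function(s, k) {
  if (is.na(s) || nchar(s) < k) return(character(0))
  starts <- seq_len(nchar(s) - k + 1)
  substring(s, starts, starts + k - 1)
}

seqs <- c("CARDYW", "CARGGW")  # made-up CDR3 sequences

# Pool the kmers of all sequences and tabulate them.
kmer_counts <- table(unlist(lapply(seqs, get_kmers, k = 3)))
print(kmer_counts)
```

A sequence of length n yields n - k + 1 kmers, so each 6-residue sequence here contributes four 3-mers.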

max.kmers

Integer. Maximum number of kmers to be plotted in the output barplots.
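One plausible reading of this limit is to keep the most abundant kmers. The sketch below shows that selection on made-up counts. Ranking by abundance is an assumption based on the example at the end of this page.

```r
# Made-up kmer counts for illustration.
kmer_counts <- c(CAR = 12, ARD = 7, RDY = 5, DYW = 3, ARG = 1)
max_kmers <- 3

# Sort by decreasing count and keep the first max_kmers entries.
top <- head(sort(kmer_counts, decreasing = TRUE), max_kmers)
print(top)
```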

specific.kmers

Character vector. Specific kmers to be plotted in the output barplots.

plot.format

Character. The output plot format: 'barplot' for barplots of kmer frequency per group, 'pca' for group-level PCA reduction across the kmer vectors, 'density' for kmer count density plots.
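To make the 'pca' option more concrete, the sketch below runs a PCA on a group-by-kmer count matrix with stats::prcomp. The matrix is made up, and the choice of prcomp with centering and without scaling is an assumption, not necessarily what the function does internally.

```r
# Made-up count matrix: one row per group, one column per kmer.
counts <- rbind(
  s1 = c(CAR = 10, ARD = 4, RDY = 1, DYW = 0),
  s2 = c(CAR = 8,  ARD = 5, RDY = 2, DYW = 1),
  s3 = c(CAR = 1,  ARD = 0, RDY = 9, DYW = 7)
)

# Centered, unscaled PCA; each group becomes one point in PC space.
pca <- prcomp(counts, center = TRUE, scale. = FALSE)
print(pca$x[, 1:2])
```

Groups with similar kmer profiles (s1 and s2 here) should land close together on the first component.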

as.proportions

Boolean. If TRUE, the kmer barplot shows proportions instead of absolute counts.
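The difference between counts and proportions can be sketched with prop.table. The counts are made up, and normalizing within each group (each row summing to 1) is an assumption about how proportions are defined here.

```r
# Made-up kmer counts per group.
counts <- rbind(
  s1 = c(CAR = 4, ARD = 1, RDY = 3),
  s2 = c(CAR = 2, ARD = 2, RDY = 0)
)

# Divide each row by its total so every group sums to 1.
props <- prop.table(counts, margin = 1)
print(props)
```

Proportions make groups with different numbers of sequences comparable, which absolute counts do not.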

Value

Returns a ggplot of the kmer analysis, depending on the plot.format parameter.

Examples

if (FALSE) {
# Calculate the 3-mer frequencies for CDRH3s and plot the 20 most abundant kmers.
VDJ_kmers(VDJ = Platypus::small_vgm[[1]],
  sequence.column = c("VDJ_cdr3s_aa"), grouping.column = "sample_id",
  kmer.k = 3, max.kmers = 20)
}