Metadata-Version: 2.4
Name: kcounter
Version: 0.1.1
Keywords: kmer,bioinformatics
Home-Page: https://github.com/apcamargo/kcounter
Author: Antonio Camargo <antoniop.camargo@gmail.com>
Author-email: Antonio Camargo <antoniop.camargo@gmail.com>
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# kcounter

[![PyPI](https://img.shields.io/pypi/v/kcounter.svg?label=PyPI&color=green)](https://pypi.python.org/pypi/kcounter)
![GitHub Workflow Status](https://img.shields.io/github/workflow/status/apcamargo/kcounter/kcounter%20workflow?label=Build%20%26%20test&logo=github)

A simple package for counting DNA k-mers in Python. Written in Rust.

## Installation

There are two ways to install `kcounter`:

- Using pip:

```
pip install kcounter
```

- Using conda:

```
conda install -c conda-forge -c bioconda kcounter
```

## Usage

Currently, `kcounter` provides a single function, `count_kmers`, which returns a dictionary mapping each k-mer of the chosen size to its count in the sequence.

```python
>>> import kcounter
>>> kcounter.count_kmers('AAACTTTTTT', 3)
{'AAA': 1.0, 'ACT': 1.0, 'AAC': 1.0, 'CTT': 1.0, 'TTT': 4.0}
>>> kcounter.count_kmers('AAACTTTTTT', 4)
{'AACT': 1.0, 'CTTT': 1.0, 'ACTT': 1.0, 'AAAC': 1.0, 'TTTT': 3.0}
```
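For readers who want to see what is being counted, the sliding-window logic can be sketched in pure Python. This is only an illustrative reference, not the Rust implementation used by `kcounter`; the name `count_kmers_reference` is hypothetical, and the float values are chosen to mirror the output shown above.

```python
from collections import Counter


def count_kmers_reference(sequence, k):
    """Hypothetical pure-Python reference; not part of the kcounter API."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Slide a window of width k over the sequence.
    # A sequence of length n contains n - k + 1 overlapping k-mers.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Convert to floats to mirror the values in the kcounter output above.
    return {kmer: float(n) for kmer, n in counts.items()}


counts = count_kmers_reference('AAACTTTTTT', 3)
print(counts['TTT'])  # 'TTT' starts at four positions in the sequence
```

A 10-base sequence has 8 overlapping 3-mers, which is why the counts in the first example sum to 8. The key order of the returned dictionary may differ from the order `kcounter` produces.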

The `relative_frequencies` parameter can be used to obtain relative k-mer frequencies:

```python
>>> kcounter.count_kmers('AAACTTTTTT', 3, relative_frequencies=True)
{'AAC': 0.125, 'TTT': 0.5, 'CTT': 0.125, 'ACT': 0.125, 'AAA': 0.125}
```
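The values above are consistent with dividing each count by the total number of k-mers in the sequence (8 for this example, so a single occurrence gives 1/8 = 0.125). A minimal pure-Python sketch of that normalization follows; `relative_kmer_frequencies` is a hypothetical helper, not part of `kcounter`, and the normalization rule is inferred from the example output rather than from the library's source.

```python
from collections import Counter


def relative_kmer_frequencies(sequence, k):
    """Hypothetical helper: k-mer counts divided by the total k-mer count."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than k: there are no k-mers to normalize.
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


freqs = relative_kmer_frequencies('AAACTTTTTT', 3)
print(freqs['TTT'], freqs['AAA'])
```

By construction, the frequencies of a non-empty result sum to 1.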

The `canonical_kmers` parameter aggregates the counts of reverse-complement k-mers (e.g., AGC/GCT):

```python
>>> kcounter.count_kmers('AAACTTTTTT', 3, canonical_kmers=True)
{'ACT': 1.0, 'AAA': 5.0, 'AAC': 1.0, 'AAG': 1.0}
```
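In the output above, `TTT` is folded into `AAA` and `CTT` into `AAG`, which matches the common convention of representing each k-mer and its reverse complement by the lexicographically smaller of the two. The sketch below illustrates that convention in pure Python. It is an assumption based on the example output, not a description of `kcounter`'s internals, and the names `reverse_complement` and `count_canonical_kmers` are hypothetical.

```python
from collections import Counter

# Translation table for the four unambiguous DNA bases.
# Assumption: any other character is left unchanged by this sketch.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Complement each base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_canonical_kmers(sequence, k):
    """Hypothetical helper: count k-mers under their canonical form."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # The canonical form is the lexicographically smaller of the
        # k-mer and its reverse complement.
        counts[min(kmer, reverse_complement(kmer))] += 1
    return {kmer: float(n) for kmer, n in counts.items()}


print(reverse_complement('AGC'))
print(count_canonical_kmers('AAACTTTTTT', 3))
```

Here one `AAA` and four `TTT` occurrences are merged into a single `AAA` entry, which accounts for the count of 5.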

## Plans for future versions

- Performance improvements.
- Add a parameter that makes the function return sparse k-mer counts.
- Implement a function that returns a numpy array.

