Flinders University

Bacillus Carbohydrate Metabolism Protvec model

Download (3.35 MB)
Version 4 2022-05-21, 01:59
Version 3 2022-05-16, 05:20
Version 2 2022-05-16, 04:59
Version 1 2022-05-16, 04:58
dataset
posted on 2022-05-21, 01:59 authored by Susie Grigson, Jody C. McKerral, James G Mitchell, Robert Edwards

Protvec model trained using 8,743 sequences from the Genome Taxonomy Database (GTDB). Before training, sequences containing 'X', sequences shorter than 30 amino acids, and sequences longer than 1024 amino acids were removed.
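The filtering criteria above can be sketched as a small predicate. This is an illustration only, not the authors' preprocessing code: the function name `keep_sequence` and its parameters are hypothetical, and it assumes sequences are uppercase one-letter amino acid strings.

```python
def keep_sequence(seq, min_len=30, max_len=1024):
    """Return True if a sequence passes the filters described above.

    Assumption: 'X' (unknown residue) is matched as an uppercase character,
    and the length limits are inclusive (30 and 1024 residues are kept).
    """
    if "X" in seq:
        return False
    return min_len <= len(seq) <= max_len


def filter_sequences(seqs):
    """Keep only the sequences that satisfy keep_sequence."""
    return [s for s in seqs if keep_sequence(s)]
```

The inclusive bounds follow from the wording: only sequences shorter than 30 or longer than 1024 amino acids are removed.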

 

Training used a vector size of 100 and a context size of 25 to produce a dictionary object containing a 100-dimensional vector for each 3-mer present in the training data. 
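As a rough sketch of how such a dictionary might be used, the example below embeds a protein sequence by summing the vectors of its overlapping 3-mers. Several things here are assumptions rather than facts from this record: that dictionary keys are 3-character strings, that values are sequences of 100 numbers, and that summing over overlapping 3-mers is an appropriate way to combine them. The helper names `kmers` and `embed` are hypothetical.

```python
def kmers(seq, k=3):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def embed(seq, model, dim=100):
    """Sum the vectors of every 3-mer in seq that is present in model.

    model is assumed to be a dict mapping 3-mer strings to vectors of
    length dim. 3-mers missing from the dictionary are skipped.
    """
    total = [0.0] * dim
    for kmer in kmers(seq):
        vec = model.get(kmer)
        if vec is None:
            continue  # 3-mer was not seen in the training data
        for i, value in enumerate(vec):
            total[i] += float(value)
    return total
```

Skipping unseen 3-mers is one possible choice; an alternative is to substitute a zero or average vector, depending on the downstream task.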


The model is stored as a .pkl file, which can be loaded using the Python pickle module.
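A minimal loading sketch follows. The file name in the usage comment is a placeholder, since the actual name of the downloaded file is not given here, and the helper `load_model` is hypothetical. Depending on how the vectors were saved, unpickling may also require the library that defines them (for example NumPy) to be installed. Note that pickle files can execute arbitrary code on load, so only unpickle files from a source you trust.

```python
import pickle


def load_model(path):
    """Load a pickled model from path and return the resulting object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Example usage (the file name is a placeholder, not the real one):
# model = load_model("protvec_model.pkl")
# print(len(model))            # number of 3-mers in the dictionary
# print(next(iter(model)))     # one example key
```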

Primary contact

susie.grigson@flinders.edu.au
