Port of Andrej Karpathy's minbpe to Rust.
Create a Rust application crate with cargo,
$> cargo new minbpe-test
In the resulting project, add minbpe to Cargo.toml,
[dependencies]
minbpe = "0.1.0"
Refer to crates.io to select the latest version. Next, in src/main.rs,
use std::path::Path;
use minbpe::{BasicTokenizer, Saveable, Tokenizer, Trainable};

fn main() {
    let text = "aaabdaaabac";
    let mut tokenizer = BasicTokenizer::new();
    // Target vocabulary size: 256 byte-level tokens plus 3 learned merges
    tokenizer.train(text, 256 + 3, false);
    println!("{:?}", tokenizer.encode(text));
    println!("{:?}", tokenizer.decode(&[258, 100, 258, 97, 99]));
    // Write the trained tokenizer to the current directory with the prefix "toy"
    tokenizer.save(Path::new("./"), "toy");
}
Execute the binary with cargo run,
$> cargo run
...
Compiling minbpe-test v0.1.0 (~/minbpe-test)
Finished dev [unoptimized + debuginfo] target(s) in 15.71s
Running `target/debug/minbpe-test`
[258, 100, 258, 97, 99]
"aaabdaaabac"
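To see where the token ids 256 to 258 in the output come from, the following is a minimal, standard-library-only sketch of byte-pair merging on the same input. It is not the crate's implementation: the helper names `merge`, `most_frequent_pair`, and `toy_bpe` are hypothetical, and the tie-breaking rule (the pair that appears first wins) is an assumption made here so the result is deterministic.

```rust
use std::collections::HashMap;

/// Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`.
fn merge(ids: &[u32], pair: (u32, u32), new_id: u32) -> Vec<u32> {
    let mut out = Vec::with_capacity(ids.len());
    let mut i = 0;
    while i < ids.len() {
        if i + 1 < ids.len() && (ids[i], ids[i + 1]) == pair {
            out.push(new_id);
            i += 2;
        } else {
            out.push(ids[i]);
            i += 1;
        }
    }
    out
}

/// Most frequent adjacent pair; ties go to the pair that appears first
/// (an assumption of this sketch, chosen for determinism).
fn most_frequent_pair(ids: &[u32]) -> Option<(u32, u32)> {
    let mut counts: HashMap<(u32, u32), usize> = HashMap::new();
    for w in ids.windows(2) {
        *counts.entry((w[0], w[1])).or_insert(0) += 1;
    }
    let mut best: Option<((u32, u32), usize)> = None;
    for w in ids.windows(2) {
        let pair = (w[0], w[1]);
        let n = counts[&pair];
        // Only a strictly higher count replaces the current best.
        if best.map_or(true, |(_, m)| n > m) {
            best = Some((pair, n));
        }
    }
    best.map(|(pair, _)| pair)
}

/// Learn up to `num_merges` merges on the UTF-8 bytes of `text`.
/// Returns the learned merges and the text encoded with them.
fn toy_bpe(text: &str, num_merges: usize) -> (Vec<((u32, u32), u32)>, Vec<u32>) {
    let mut ids: Vec<u32> = text.bytes().map(u32::from).collect();
    let mut merges = Vec::new();
    for k in 0..num_merges {
        let Some(pair) = most_frequent_pair(&ids) else { break };
        let new_id = 256 + k as u32; // ids 0..=255 are reserved for raw bytes
        ids = merge(&ids, pair, new_id);
        merges.push((pair, new_id));
    }
    (merges, ids)
}

fn main() {
    let (merges, ids) = toy_bpe("aaabdaaabac", 3);
    for (pair, id) in &merges {
        println!("{:?} -> {}", pair, id);
    }
    println!("{:?}", ids); // prints [258, 100, 258, 97, 99]
}
```

With three merges, "aa" becomes 256, then 256 followed by "a" becomes 257, then 257 followed by "b" becomes 258, which leaves 258 ("aaab"), 100 ("d"), 258, 97 ("a"), 99 ("c").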
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.