Module commands::tokenizer
Command Tokenization
The command parser needs to be able to tokenize commands into their constituent words and whitespace.

The tokenizer breaks source text into a vector of tokens, each of which is either whitespace or a word. Single and double quotes can be used to produce a single token that includes whitespace.

Tokens also track their location within the source text. This lets a parser built on the tokenizer provide better error highlighting and other diagnostics.
Examples
```rust
use commands::tokenizer::{tokenize, TokenType};

if let Ok(tokens) = tokenize("word") {
    assert_eq!(tokens.len(), 1);
}

// This is 3 tokens due to the whitespace token
// between the 2 words.
if let Ok(tokens) = tokenize("show interface") {
    assert_eq!(tokens.len(), 3);
    assert_eq!(tokens[1].token_type, TokenType::Whitespace);
}

// Double quoted strings are treated as a single token.
if let Ok(tokens) = tokenize(r#"echo -n "a b c""#) {
    assert_eq!(tokens.len(), 5);
    assert_eq!(tokens[0].text, "echo");
    assert_eq!(tokens[2].text, "-n");
    assert_eq!(tokens[4].text, r#""a b c""#);
}

// Single quoted strings are treated as a single token
// as well.
if let Ok(tokens) = tokenize(r#"'"One token"' 'and another'"#) {
    assert_eq!(tokens.len(), 3);
}

// Or you can use a \ to escape a space.
if let Ok(tokens) = tokenize(r#"ls My\ Documents"#) {
    assert_eq!(tokens.len(), 3);
    assert_eq!(tokens[2].text, r#"My\ Documents"#);
}
```
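The quoting and escaping rules described above can be sketched as a single scanning loop. This is a minimal illustration, not this crate's implementation: `tokenize_sketch` and the `Kind` enum are hypothetical names, and the sketch omits source-location tracking and error reporting.

```rust
// Hypothetical sketch of the tokenization rules: runs of whitespace
// become one Whitespace token; a word runs until unescaped, unquoted
// whitespace, with quotes and backslashes kept in the token text.
#[derive(Debug, PartialEq)]
enum Kind {
    Word,
    Whitespace,
}

fn tokenize_sketch(src: &str) -> Vec<(Kind, String)> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        let mut text = String::new();
        if c.is_whitespace() {
            // Consecutive whitespace collapses into one token.
            while let Some(&c) = chars.peek() {
                if !c.is_whitespace() {
                    break;
                }
                text.push(c);
                chars.next();
            }
            tokens.push((Kind::Whitespace, text));
        } else {
            let mut quote: Option<char> = None;
            while let Some(&c) = chars.peek() {
                match quote {
                    // A matching quote closes the quoted region.
                    Some(q) if c == q => quote = None,
                    // Inside quotes, everything (even spaces) is kept.
                    Some(_) => {}
                    // An opening quote starts a quoted region.
                    None if c == '\'' || c == '"' => quote = Some(c),
                    // A backslash escapes the next character.
                    None if c == '\\' => {
                        text.push(c);
                        chars.next();
                        if let Some(&e) = chars.peek() {
                            text.push(e);
                            chars.next();
                        }
                        continue;
                    }
                    // Unquoted whitespace ends the word.
                    None if c.is_whitespace() => break,
                    None => {}
                }
                text.push(c);
                chars.next();
            }
            tokens.push((Kind::Word, text));
        }
    }
    tokens
}

fn main() {
    let t = tokenize_sketch(r#"echo -n "a b c""#);
    assert_eq!(t.len(), 5);
    assert_eq!(t[4].1, r#""a b c""#);

    let t = tokenize_sketch(r#"ls My\ Documents"#);
    assert_eq!(t.len(), 3);
    assert_eq!(t[2].1, r#"My\ Documents"#);
}
```

Note that, as in the examples above, the sketch leaves quote characters and backslashes in the token text rather than stripping them; deciding what they mean is left to the parser consuming the tokens.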
Structs

SourceLocation: A range within a body of text.
SourceOffset: A position within a body of text.
Token: A token from a body of text.

Enums

TokenType: The role that a token plays.
TokenizerError: Errors that can occur during tokenization.

Functions

tokenize: Tokenize a body of text.