prompt_tokenizers
Module containing PromptTokenizingStrategy and Prompter classes
Classes
| Name | Description |
|---|---|
| AlpacaMultipleChoicePromptTokenizingStrategy | Tokenizing strategy for Alpaca Multiple Choice prompts. |
| AlpacaPromptTokenizingStrategy | Tokenizing strategy for Alpaca prompts. |
| AlpacaReflectionPTStrategy | Tokenizing strategy for Alpaca Reflection prompts. |
| DatasetWrappingStrategy | Abstract class for wrapping datasets for chat messages. |
| GPTeacherPromptTokenizingStrategy | Tokenizing strategy for GPTeacher prompts. |
| InstructionPromptTokenizingStrategy | Tokenizing strategy for instruction-based prompts. |
| InvalidDataException | Exception raised when the data is invalid. |
| JeopardyPromptTokenizingStrategy | Tokenizing strategy for Jeopardy prompts. |
| NomicGPT4AllPromptTokenizingStrategy | Tokenizing strategy for NomicGPT4All prompts. |
| OpenAssistantPromptTokenizingStrategy | Tokenizing strategy for OpenAssistant prompts. |
| PromptTokenizingStrategy | Abstract class for tokenizing strategies. |
| ReflectionPromptTokenizingStrategy | Tokenizing strategy for Reflection prompts. |
| SummarizeTLDRPromptTokenizingStrategy | Tokenizing strategy for SummarizeTLDR prompts. |
AlpacaMultipleChoicePromptTokenizingStrategy
prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for Alpaca Multiple Choice prompts.
AlpacaPromptTokenizingStrategy
prompt_tokenizers.AlpacaPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for Alpaca prompts.
AlpacaReflectionPTStrategy
prompt_tokenizers.AlpacaReflectionPTStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for Alpaca Reflection prompts.
DatasetWrappingStrategy
prompt_tokenizers.DatasetWrappingStrategy()
Abstract class for wrapping datasets for chat messages.
GPTeacherPromptTokenizingStrategy
prompt_tokenizers.GPTeacherPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for GPTeacher prompts.
InstructionPromptTokenizingStrategy
prompt_tokenizers.InstructionPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for instruction-based prompts.
InvalidDataException
prompt_tokenizers.InvalidDataException()
Exception raised when the data is invalid.
JeopardyPromptTokenizingStrategy
prompt_tokenizers.JeopardyPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for Jeopardy prompts.
NomicGPT4AllPromptTokenizingStrategy
prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for NomicGPT4All prompts.
OpenAssistantPromptTokenizingStrategy
prompt_tokenizers.OpenAssistantPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for OpenAssistant prompts.
PromptTokenizingStrategy
prompt_tokenizers.PromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Abstract class for tokenizing strategies.
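The `train_on_inputs` and `sequence_len` parameters are not described on this page. The following is a minimal standalone sketch of one common interpretation, not the library's implementation. It assumes that `train_on_inputs=False` masks prompt tokens in the labels with `-100` (the default ignore index of PyTorch's cross-entropy loss) and that `sequence_len` caps the tokenized length. The function name `build_labels_sketch` and its `prompt_ids` / `response_ids` arguments are hypothetical.

```python
from typing import Dict, List

# Assumption: -100 is used as the "ignore" label, matching the default
# ignore_index of PyTorch's cross-entropy loss.
IGNORE_INDEX = -100


def build_labels_sketch(
    prompt_ids: List[int],
    response_ids: List[int],
    train_on_inputs: bool = False,
    sequence_len: int = 2048,
) -> Dict[str, List[int]]:
    """Hypothetical helper: combine prompt and response token ids into one example."""
    # Assumption: sequence_len acts as a hard cap on the number of tokens.
    input_ids = (prompt_ids + response_ids)[:sequence_len]
    if train_on_inputs:
        # Train on every token, prompt included.
        labels = list(input_ids)
    else:
        # Mask the prompt so only response tokens contribute to the loss.
        labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:sequence_len]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }


example = build_labels_sketch(prompt_ids=[1, 2, 3], response_ids=[4, 5])
print(example["labels"])
```

Under these assumptions, the prompt positions in `labels` are masked while the response positions keep their token ids.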
ReflectionPromptTokenizingStrategy
prompt_tokenizers.ReflectionPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for Reflection prompts.
SummarizeTLDRPromptTokenizingStrategy
prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(
prompter,
tokenizer,
train_on_inputs=False,
sequence_len=2048,
)
Tokenizing strategy for SummarizeTLDR prompts.
Functions
| Name | Description |
|---|---|
| parse_tokenized_to_result | Parses the tokenized prompt and appends the tokenized input_ids, attention_mask, and labels to the result. |
| tokenize_prompt_default | Returns the default values for the tokenize prompt function. |
parse_tokenized_to_result
prompt_tokenizers.parse_tokenized_to_result(
result,
current_len,
res,
labels,
pad_token_id=None,
)
Parses the tokenized prompt and appends the tokenized input_ids, attention_mask, and labels to the result.
tokenize_prompt_default
prompt_tokenizers.tokenize_prompt_default()
Returns the default values for the tokenize prompt function.
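The two functions in this section suggest an accumulator pattern: start from empty defaults, then append each tokenized segment. The sketch below illustrates that pattern in standalone Python and is not the library's code. The names `default_result_sketch` and `append_tokenized_sketch` are hypothetical, and the return shapes (a result dict plus a running length) and the use of `pad_token_id` to zero out padding positions in the attention mask are assumptions based on the parameter names.

```python
from typing import Dict, List, Optional, Tuple

Result = Dict[str, List[int]]


def default_result_sketch() -> Tuple[Result, int]:
    """Hypothetical stand-in: an empty result dict and a running length of 0."""
    return {"input_ids": [], "attention_mask": [], "labels": []}, 0


def append_tokenized_sketch(
    result: Result,
    current_len: int,
    res: Result,
    labels: List[int],
    pad_token_id: Optional[int] = None,
) -> Tuple[Result, int]:
    """Hypothetical stand-in: write one tokenized segment at position current_len."""
    input_ids = res["input_ids"]
    end = current_len + len(input_ids)
    result["input_ids"][current_len:end] = input_ids
    # Assumption: padding tokens receive attention 0, all other tokens 1.
    result["attention_mask"][current_len:end] = [
        0 if tok == pad_token_id else 1 for tok in input_ids
    ]
    result["labels"][current_len:end] = labels
    return result, end


# Usage: accumulate a masked prompt segment, then a trained response segment.
result, current_len = default_result_sketch()
result, current_len = append_tokenized_sketch(
    result, current_len, {"input_ids": [5, 6, 0]}, [-100, -100, -100], pad_token_id=0
)
result, current_len = append_tokenized_sketch(
    result, current_len, {"input_ids": [7, 8]}, [7, 8]
)
print(result, current_len)
```

Returning the updated length alongside the result lets the caller chain segments without recomputing offsets.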