Fixed a bug in FastWordpieceTokenizer where vocab sizes >= 7 would cause failures if the unknown token was not at the end of the vocabulary by ensuring the internal hash map is reserved upfront. #1470

Open
copybara-service[bot] wants to merge 1 commit into master from test_870823136

Conversation

copybara-service bot commented Feb 16, 2026

Fixed a bug in FastWordpieceTokenizer where vocabularies of 7 or more entries would cause failures if the unknown token was not the last entry in the vocabulary. The fix ensures the internal hash map is reserved upfront.
Also fixed internal ClangTidy include-cleaner errors in string_vocab.cc.
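
For readers unfamiliar with the pattern, here is a minimal sketch of what "reserved upfront" can look like. It is not the code from this change: the diff is not shown in this conversation, and the sketch assumes a plain `std::unordered_map`, whereas the container actually used and the exact failure mechanism are not stated here. `BuildVocabMap` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper for illustration only; not the TensorFlow Text code.
// Builds a token -> id map and reserves room for every entry before the
// first insertion, so the table is never rehashed while it is being filled.
std::unordered_map<std::string, int> BuildVocabMap(
    const std::vector<std::string>& vocab) {
  std::unordered_map<std::string, int> token_to_id;
  token_to_id.reserve(vocab.size());  // allocate all buckets upfront

  const std::size_t buckets_before = token_to_id.bucket_count();
  for (std::size_t i = 0; i < vocab.size(); ++i) {
    token_to_id.emplace(vocab[i], static_cast<int>(i));
  }

  // reserve(n) guarantees that inserting up to n elements does not rehash,
  // so the bucket count is unchanged after the loop.
  assert(token_to_id.bucket_count() == buckets_before);
  return token_to_id;
}
```

With this shape, a vocabulary such as `{"a", "b", "[UNK]", "c", "d", "e", "f", "g"}` maps `"[UNK]"` to id 2 regardless of where the unknown token sits, and no rehash happens during the fill. Avoiding a mid-build rehash matters most for containers that invalidate references or iterators to stored elements when they grow.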

copybara-service bot force-pushed the test_870823136 branch 2 times, most recently from b603f19 to 8666d4a on February 16, 2026 13:46
…use failures if the unknown token was not at the end of the vocabulary by ensuring the internal hash map is reserved upfront.

Also fixed internal ClangTidy include cleaner errors in string_vocab.cc.

PiperOrigin-RevId: 870823136