[RNE Rewrite] Add text and image embeddings pipelines by msluszniak · Pull Request #1292 · software-mansion/react-native-executorch

msluszniak · 2026-06-30T10:56:04Z

Description

Adds text and image embeddings pipelines to the new architecture, achieving parity with the old flow. Embeddings are pure-TypeScript tasks (pooling + L2-norm stay baked into the .pte): text tokenizes and runs forward; image reuses the existing image preprocessor. To run the existing int64-input embedding models unchanged, this adds an int64/Long tensor dtype to the core (the tensor data path is byte-oriented, so it is a small dtype.{h,cpp} + tensor.ts change).
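As a rough illustration of the int64 input path described above, the sketch below packs token ids into a `BigInt64Array` and exposes the same memory as bytes, which is the kind of view a byte-oriented tensor path can consume. The helper names (`tokenIdsToInt64`, `asBytes`) are hypothetical and not part of the library's API, and the sketch assumes the tokenizer returns plain JS numbers.

```typescript
// Hypothetical helper (not the library's API): pack token ids into int64 storage.
function tokenIdsToInt64(tokenIds: readonly number[]): BigInt64Array {
  if (tokenIds.length === 0) {
    // Guard the empty case explicitly instead of failing later on BigInt(undefined).
    throw new Error('Input tokenized to zero tokens; nothing to embed.');
  }
  const out = new BigInt64Array(tokenIds.length);
  tokenIds.forEach((id, i) => {
    if (!Number.isSafeInteger(id)) {
      throw new TypeError(`Token id at index ${i} is not an integer: ${id}`);
    }
    out[i] = BigInt(id);
  });
  return out;
}

// Byte view over the same memory (no copy); each int64 element occupies 8 bytes.
function asBytes(data: BigInt64Array): Uint8Array {
  return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
}
```

Because the byte view shares the underlying buffer, supporting a new dtype mostly comes down to knowing its element size, which is consistent with the change being small.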

Text inputs are fed at their exact token length (no padding). model.execute validates dynamically-shaped forward inputs against the [min, max, step] bounds exposed by an optional get_dynamic_dims method; models without it keep exact per-dimension validation. This fixes scale-sensitive pooling heads (e.g. DistilUSE's tanh projection), which padding otherwise corrupts.
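The validation rule above can be sketched in TypeScript as follows. This is only an illustration: the actual check lives in the native model code, and the names (`DimBounds`, `validateInputShape`) are hypothetical. It also assumes that `step` is counted from `min`, which the description does not spell out.

```typescript
// Hypothetical shape of one row returned by get_dynamic_dims: [min, max, step].
type DimBounds = { min: number; max: number; step: number };

// Assumption: a size is valid if it lies in [min, max] and is reachable from min in whole steps.
function dimWithinBounds(size: number, b: DimBounds): boolean {
  return size >= b.min && size <= b.max && (size - b.min) % b.step === 0;
}

// Validate one input tensor's shape. With bounds, each dimension is checked against
// its [min, max, step] row; without bounds, the shape must match exactly.
function validateInputShape(
  actual: readonly number[],
  declared: readonly number[],
  bounds?: readonly DimBounds[],
): void {
  if (actual.length !== declared.length) {
    throw new Error(`Rank mismatch: got ${actual.length}, expected ${declared.length}`);
  }
  actual.forEach((size, i) => {
    const b = bounds?.[i];
    const ok = b ? dimWithinBounds(size, b) : size === declared[i];
    if (!ok) {
      throw new Error(`Dimension ${i} has invalid size ${size}`);
    }
  });
}
```

With bounds such as `[{ min: 1, max: 1, step: 1 }, { min: 1, max: 128, step: 1 }]`, a `[1, 7]` token tensor passes without padding, while a model that declares no bounds would still require the exact `[1, 128]` shape.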

Includes createTextEmbeddings / createImageEmbeddings tasks, useTextEmbeddings / useImageEmbeddings hooks, models.textEmbeddings / models.imageEmbeddings registry entries, an interactive text-embeddings demo in apps/nlp, and a CLIP zero-shot image-embeddings demo in apps/computer-vision.

Introduces a breaking change?

Yes
No

Type of change

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Documentation update (improves or adds clarity to existing documentation)
Other (chores, tests, code style improvements etc.)

Tested on

iOS
Android

Testing instructions

nlp app → Text Embeddings: seeds a sentence library; type a query and tap Find similar to rank the sentences by cosine similarity, and switch models via the chips. Verified on a physical Android device (arm64): all-MiniLM-L6-v2 returns 384-dim L2-normalized embeddings (~25 ms/forward on XNNPACK); DistilUSE ranks correctly with a wide similarity spread (previously compressed by padding).
computer-vision app → Image Embeddings: pick an image and rank editable text labels via CLIP zero-shot (image vs. text embeddings). Verified on device.
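Both demos rank candidates by cosine similarity. A minimal sketch of that ranking is shown below, assuming every embedding is already L2-normalized so the cosine reduces to a dot product. The names (`Labeled`, `dot`, `rankBySimilarity`) are hypothetical and are not taken from the demo code.

```typescript
// Hypothetical candidate type: a label (sentence or class name) plus its embedding.
type Labeled = { label: string; embedding: Float32Array };

// Dot product; equals cosine similarity only when both vectors have unit L2 norm.
function dot(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Embedding size mismatch: ${a.length} vs ${b.length}`);
  }
  return a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0);
}

// Score every candidate against the query and sort from most to least similar.
function rankBySimilarity(
  query: Float32Array,
  candidates: readonly Labeled[],
): { label: string; score: number }[] {
  return candidates
    .map((c) => ({ label: c.label, score: dot(query, c.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

For the CLIP zero-shot case, `query` would be the image embedding and `candidates` the text-label embeddings; for the text demo, both sides come from the same text model.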

Screenshots

Related issues

#1247

Checklist

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have updated the documentation accordingly
My changes generate no new warnings

Additional notes

DistilUSE and CLIP (text) are re-exported with the get_dynamic_dims method and pinned to v0.10.0; the remaining text-embedding models (all-MiniLM-L6-v2, all-mpnet-base-v2, multi-qa MiniLM/MPNet, paraphrase-ML) still need re-export to v0.10.0.

Add int64/Long tensor dtype support and text/image embeddings tasks, hooks, and model registry entries, plus an interactive text-embeddings demo screen in apps/nlp. Closes #1247

model.execute now validates dynamically-shaped forward inputs against the model-declared [min, max, step] bounds exposed by an optional get_dynamic_dims method, instead of requiring an exact shape match; models without it keep exact per-dimension validation. Text embeddings feed the exact token length with no padding, which fixes scale-sensitive pooling heads (e.g. DistilUSE's tanh projection). Point DistilUSE at v0.10.0 (re-exported with get_dynamic_dims).

…mbeddings demo
- Simplify text-embeddings cosine to a dot product (all models L2-normalize) and drop redundant inline comments.
- Move the get_dynamic_dims / input-validation contract into the ModelHostObject class docs; trim the inline narration in model.cpp.
- Add an Image Embeddings example to the computer-vision app: pick two images and compare their CLIP embeddings by cosine similarity.

Rework the computer-vision Image Embeddings screen (based on main's CLIP demo): pick an image and rank editable text labels by CLIP image/text embedding similarity, instead of the uninformative two-image score. Pads the scroll content past the Android nav bar. Point CLIP text + image at v0.10.0 (text re-exported with get_dynamic_dims; image unchanged) and declare the textEmbeddings feature in the app.

- model.{h,cpp}: read get_dynamic_dims once per model and cache it instead of re-executing the method on every forward() call; reject a present-but-malformed declaration (wrong dtype/rank/shape, bad min/max/step, or row count not matching forward's tensor input dims) with an explicit error instead of silently falling back to exact validation.
- textEmbeddings: throw a clear error when input tokenizes to zero tokens (was BigInt(undefined)); fix docstring to match no-padding behavior.
- useTextEmbeddings: expose localPath/tokenizerPath like sibling hooks.
- computer-vision: extract shared skImageToBuffer helper, dedup from classification and imageEmbeddings screens.
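
The read-once-and-reject-malformed behavior from the first item could look roughly like the TypeScript sketch below. The real logic is in model.{h,cpp}; everything here is illustrative, including the names (`parseDynamicDims`, `DynamicDimsCache`), the flat `[min, max, step, ...]` layout, and the specific rules for what counts as a bad row (non-integer values, `min < 0`, `max < min`, `step < 1`).

```typescript
type DimBounds = { min: number; max: number; step: number };

// Parse a flat [min, max, step, min, max, step, ...] declaration; throw if malformed.
function parseDynamicDims(flat: readonly number[], expectedRows: number): DimBounds[] {
  if (flat.length !== expectedRows * 3) {
    throw new Error(
      `get_dynamic_dims: expected ${expectedRows} rows of [min, max, step], got ${flat.length} values`,
    );
  }
  const rows: DimBounds[] = [];
  for (let r = 0; r < expectedRows; r++) {
    const [min = NaN, max = NaN, step = NaN] = flat.slice(r * 3, r * 3 + 3);
    const allIntegers = [min, max, step].every((v) => Number.isInteger(v));
    // Assumed sanity rules; the source only says "bad min/max/step".
    if (!allIntegers || min < 0 || max < min || step < 1) {
      throw new Error(`get_dynamic_dims: malformed row ${r}: [${min}, ${max}, ${step}]`);
    }
    rows.push({ min, max, step });
  }
  return rows;
}

// Read the declaration at most once per model; null means the method is absent.
class DynamicDimsCache {
  private cached: DimBounds[] | null | undefined = undefined;
  private readonly read: () => readonly number[] | null;
  private readonly expectedRows: number;

  constructor(read: () => readonly number[] | null, expectedRows: number) {
    this.read = read;
    this.expectedRows = expectedRows;
  }

  get(): DimBounds[] | null {
    if (this.cached === undefined) {
      const raw = this.read();
      this.cached = raw === null ? null : parseDynamicDims(raw, this.expectedRows);
    }
    return this.cached;
  }
}
```

A `null` result keeps the exact-shape path for models without the method, while a malformed declaration surfaces as an error rather than a silent fallback.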

- Use unordered_set::contains instead of count()==0 (readability-container-contains).
- Keep the new dynamic-bounds cache members public so the class stays an all-public data carrier; adding private member variables had broken the non-private-member exemption and flagged the existing public members.

[RNE Rewrite] Add text and image embeddings pipelines

bf6bc01

msluszniak self-assigned this Jun 30, 2026

msluszniak added the refactoring label Jun 30, 2026

msluszniak linked an issue Jun 30, 2026 that may be closed by this pull request

[RNE Rewrite] Add image and text embeddings pipelines #1247

Open

msluszniak added the feature label (PRs that implement a new feature) Jun 30, 2026

msluszniak commented Jul 1, 2026

msluszniak added 5 commits July 1, 2026 13:28

fix(clang-tidy): use auto for const_data_ptr result (modernize-use-auto)

18547ee

msluszniak marked this pull request as ready for review July 1, 2026 16:07

msluszniak requested a review from barhanc July 1, 2026 16:07

msluszniak wants to merge 7 commits into rne-rewrite from @ms/add-embeddings
