Conversation
|
Review requested:
|
|
The
notable-change
Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section. |
|
Nice! This is a great addition. Since it's such a large PR, this will take me some time to review. Will try to tackle it over the next week. |
There was a problem hiding this comment.
Shouldn't we import this at the top level / lazy load it at the top level?
There was a problem hiding this comment.
| restore: restoreFS, | |
| restore: ctx.restore, |
nit
lib/internal/vfs/entries.js
Outdated
lib/internal/vfs/virtual_fs.js
Outdated
lib/internal/vfs/virtual_fs.js
Outdated
lib/internal/vfs/virtual_fs.js
Outdated
There was a problem hiding this comment.
| }).catch((err) => { | |
| }, (err) => { |
lib/internal/vfs/virtual_fs.js
Outdated
lib/internal/vfs/virtual_fs.js
Outdated
There was a problem hiding this comment.
| }).catch((err) => { | |
| }, (err) => { |
|
Left an initial review, but like @Ethan-Arrowood said, it'll take time for a more in depth look |
|
It's nice to see some momentum in this area, though from a first glance it seems the design has largely overlooked the feedback from real world use cases collected 4 years ago: https://github.com/nodejs/single-executable/blob/main/docs/virtual-file-system-requirements.md - I think it's worth checking that the API satisfies the constraints that users of this feature have provided, to not waste the work that have been done by prior contributors to gather them, or having to reinvent it later (possibly in a breaking manner) to satisfy these requirements from real world use cases. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #61478 +/- ##
==========================================
- Coverage 89.74% 89.65% -0.09%
==========================================
Files 675 690 +15
Lines 204601 210868 +6267
Branches 39325 40188 +863
==========================================
+ Hits 183616 189056 +5440
- Misses 13273 14081 +808
- Partials 7712 7731 +19
🚀 New features to boost your workflow:
|
|
And why not something like OPFS aka whatwg/fs? const rootHandle = await navigator.storage.getDirectory()
await rootHandle.getFileHandle('config.json', { create: true })
fs.mount('/app', rootHandle) // to make it work with fs
fs.readFileSync('/app/config.json')OR const rootHandle = await navigator.storage.getDirectory()
await rootHandle.getFileHandle('config.json', { create: true })
fs.readFileSync('sandbox:/config.json')fs.createVirtual seems like something like a competing specification |
5e317de to
977cc3d
Compare
I generally prefer not to interleave with WHATWG specs as much as possible for core functionality (e.g., SEA). In my experience, they tend to perform poorly on our codebase and remove a few degrees of flexibility. (I also don't find much fun in working on them, and I'm way less interested in contributing to that.) On an implementation side, the core functionality of this feature will be identical (technically, it's missing writes that OPFS supports), as we would need to impact all our internal fs methods anyway. If this lands, we can certainly iterate on a WHATWG-compatible API for this, but I would not add this to this PR. |
|
Small prior art: https://github.com/juliangruber/subfs |
8d711c1 to
73c18cd
Compare
|
I also worked on this a bit on the side recently: Qard@73b8fc6 That is very much in chaotic ideation stage with a bunch of LLM assistance to try some different ideas, but the broader concept I was aiming for was to have a module.exports = new VirtualFileSystem(new LocalProvider())I intended for it to be extensible for a bunch of different interesting scenarios, so there's also an S3 provider and a zip file provider there, mainly just to validate that the model can be applied to other varieties of storage systems effectively. Keep in mind, like I said, the current state is very much just ideation in a branch I pushed up just now to share, but I think there are concepts for extensibility in there that we could consider to enable a whole ecosystem of flexible storage providers. 🙂 Personally, I would hope for something which could provide both read and write access through an abstraction with swappable backends of some variety, this way we could pass around these virtualized file systems like objects and let an ecosystem grow around accepting any generalized virtual file system for its storage backing. I think it'd be very nice for a lot of use cases like file uploads or archive management to be able to just treat them like any other readable and writable file system. |
just a bit off topic... but this reminds me of why i created this feature request: Would not lie, it would be cool if NodeJS also provided some type of static example that would only work in NodeJS (based on how it works internally) const size = 26
const blobPart = BlobFrom({
size,
stream (start, end) {
// can either be sync or async (that resolves to a ReadableStream)
// return new Response('abcdefghijklmnopqrstuvwxyz'.slice(start, end)).body
// return new Blob(['abcdefghijklmnopqrstuvwxyz'.slice(start, end)]).stream()
return fetch('https://httpbin.dev/range/' + size, {
headers: {
range: `bytes=${start}-${end - 1}`
}
}).then(r => r.body)
}
})
blobPart.text().then(text => {
console.log('a-z', text)
})
blobPart.slice(-3).text().then(text => {
console.log('x-z', text)
})
const a = blobPart.slice(0, 6)
a.text().then(text => {
console.log('a-f', text)
})
const b = a.slice(2, 4)
b.text().then(text => {
console.log('c-d', text)
})An actual working PoC(I would not rely on this unless it became officially supported by nodejs core - this is a hack) const blob = new Blob()
const symbols = Object.getOwnPropertySymbols(blob)
const blobSymbol = symbols.map(s => [s.description, s])
const symbolMap = Object.fromEntries(blobSymbol)
const {
kHandle,
kLength,
} = symbolMap
function BlobFrom ({ size, stream }) {
const blob = new Blob()
if (size === 0) return blob
blob[kLength] = size
blob[kHandle] = {
span: [0, size],
getReader () {
const [start, end] = this.span
if (start === end) {
return { pull: cb => cb(0) }
}
let reader
return {
async pull (cb) {
reader ??= (await stream(start, end)).getReader()
const {done, value} = await reader.read()
cb(done ^ 1, value)
}
}
},
slice (start, end) {
const [baseStart] = this.span
return {
span: [baseStart + start, baseStart + end],
getReader: this.getReader,
slice: this.slice,
}
}
}
return blob
}currently problematic to do: also need to handle properly clone, serialize & deserialize, if this where to be sent of to another worker - then i would transfer a MessageChannel where the worker thread asks main frame to hand back a transferable ReadableStream when it needs to read something. but there are probably better ways to handle this internally in core with piping data directly to and from different destinations without having to touch the js runtime? - if only getReader could return the reader directly instead of needing to read from the ReadableStream using js? |
|
I'm not a huge fan of the My thinking was something like: function someApi(fs: FsLike) {
fs.readFile(...)
}
// Works with just the base fs provider itself.
someApi(require('fs'))
// Also works with any provider that conforms to the FsLike contract.
// Because mounting exists on the provider rather than the actual vfs
// additional mounts can't be set from elsewhere without access to the
// original provider.
const mockOverlay = new OverlayProvider(require('fs'))
mockOverlay.mount('/mount', new MockProvider(...))
someApi(new VirtualFileSystem(mockOverlay)) |
|
That hijack is what makes it viable for the ecosystem without changing their current APIs. |
|
Perhaps, but it's also somewhat risky. Perhaps we need to think about how we can apply the permission model controls to that to prevent unintended tampering with the global fs? |
Use wrapModuleLoad instead of Module._load directly to ensure diagnostics channels and performance tracing work properly when loading modules from SEA VFS.
- Add "Limitations" section to vfs.md documenting unsupported features: native addons, child processes, worker threads, fs.watch polling, and SEA code caching - Add code caching limitations section to SEA VFS documentation - Add VFS support section to fs.md with example and link to vfs.md
Use POSIX path normalization for VFS paths starting with '/' to preserve forward slashes on Windows. The platform's path.normalize() converts forward slashes to backslashes on Windows, breaking VFS path matching. Add normalizeVFSPath() helper that uses path.posix.normalize() for Unix-style paths and path.normalize() for Windows drive letter paths. Fixes test-vfs-chdir, test-vfs-real-provider, test-vfs-mount-events, and test-vfs-watch on Windows.
lib/internal/main/embedding.js
Outdated
There was a problem hiding this comment.
I don't think these are necessary if there is a hook, the module loading should take care of that during resolution, otherwise the hooks will see wrong values coming in.
Also this should be an option in the SEA config IMO, if the user already bundles their code for performance, there's no good reason to add the overhead in.
There was a problem hiding this comment.
to clarify: would you prefer to have a SEAProvider that users can add if they want to?
There was a problem hiding this comment.
I was thinking about a field in the sea config JSON, so that this behaviour isn't installed if the field is false (or the field can be more sophisticated than a Boolean for future extensions)
lib/internal/main/embedding.js
Outdated
There was a problem hiding this comment.
I think this would only work for top level modules but not inner ones? Is that intentional?
There was a problem hiding this comment.
This isn't accurate: code caching doesn't require execution, it's the discovery of CJS modules that requires execution. The reason why it's not supported isn't because it's technically impossible, but because this PR doesn't take care of the discovery of modules within the VFS during SEA building and build caches for these modules. But this documentation is alluding to the idea that it's technically not possible instead of just being an incomplete implementation.
There was a problem hiding this comment.
| `useCodeCache` for your main entry point and `useSnapshot` for initialization. | |
| bundling the application to enable code caching and do not rely on module loading in VFS. |
- Make SEA VFS opt-in via `"useVfs": true` config field with
corresponding `kEnableVfs` flag in C++ and `isVfsEnabled()` binding
- Replace manual VFS path resolution in embedderRequire with
wrapModuleLoad() that flows through registered module hooks,
supporting transitive requires
- Set main module filename to VFS path when VFS is active so
relative require('./foo.js') resolves correctly
- Convert _-prefixed methods to # private in file_system.js
- Fix inaccurate code caching docs per reviewer suggestion
- Replace supported sync method list with unsupported method list
- Add native addon limitation note for VFS in SEA docs
Use require('./modules/math.js') instead of require('/sea/modules/math.js')
in the main SEA script to verify that relative requires work from the
entry point, since __filename is now set to a VFS path.
Fix clang-format issue in FPrintF call for useVfs config parsing. Fix markdown lint warning for unordered reference link.
|
|
||
| #### Code caching limitations | ||
|
|
||
| The `useCodeCache` option in the SEA configuration does not apply to modules |
There was a problem hiding this comment.
I think both should clarify that this is a current limitation from incomplete implementation, not that they are technically impossible (lest it stops people from trying to implement it, like what happened with the "ESM is async so it's not supported by require()" statement once in the doc)
|
|
||
| function embedderRequire(id) { | ||
| return loadBuiltinModuleForEmbedder(id).exports; | ||
| const normalizedId = normalizeRequirableId(id); |
There was a problem hiding this comment.
Actually I think when VFS is enabled, we should just use createRequire? Otherwise builtin loading does not flow through the hooks, and we still have that awkward "require doesn't have normal properties except main" in the doc for no good reason...
| args.GetReturnValue().Set(IsSingleExecutable()); | ||
| } | ||
|
|
||
| void IsVfsEnabled(const FunctionCallbackInfo<Value>& args) { |
There was a problem hiding this comment.
N/B: all these Boolean getters that are always called in one place seems a bit wasteful, they could be tidy up into one call (preciously there were two, and now three is a charm?)
|
|
||
| > Stability: 1 - Experimental | ||
|
|
||
| Instead of using the `node:sea` API to access individual assets, you can use |
There was a problem hiding this comment.
| Instead of using the `node:sea` API to access individual assets, you can use | |
| In addition to using the `node:sea` API to access individual assets, you can use |
I think the sea API would continue to work even with VFS (lest there are things people prefer to load more directly for perf, since the API provides zero-copy getters).
Also could you add a test to check that they can load the same thing?
| } | ||
| ``` | ||
|
|
||
| The VFS supports the following `fs` operations on bundled assets: |
There was a problem hiding this comment.
Are they the same as normal VFS? If so better to just link instead of repeating the list again? Or it may be better to list what is not supported instead.
|
|
||
| #### Loading modules from VFS in SEA | ||
|
|
||
| You can use `require()` directly with absolute VFS paths: |
There was a problem hiding this comment.
I think the "when useVfs is enabled" precondition should be moved here.
| * Unregisters a VFS instance. | ||
| * @param {VirtualFileSystem} vfs The VFS instance to unregister | ||
| */ | ||
| function unregisterVFS(vfs) { |
There was a problem hiding this comment.
In the public API the terminology is deregister
|
|
||
| // Override fs.readFileSync | ||
| // We need to be careful to only intercept when VFS should handle the path | ||
| fs.readFileSync = function readFileSync(path, options) { |
There was a problem hiding this comment.
Why does it need to patch the fs here? I think overriding load is enough?
| }; | ||
|
|
||
| // Override fs.realpathSync | ||
| fs.realpathSync = function realpathSync(path, options) { |
There was a problem hiding this comment.
Similarly - why are we patching these methods? I think we can accompolish this by changing the internal paths used by resolution to read from vfs instead of fs, and make that behavior toggle-able here (e.g. instead of calling realpathSync directly in
node/lib/internal/modules/esm/resolve.js
Line 279 in e8c9c43
realpathSync, or the vfs-aware one. When vfs is enabled, we just toggle them to return the vfs-aware one. That way we can avoid the mismatches coming from another layer of patching.
Also since the module loader caches realpathSync and a bunch of fs methods, this patching does not work if resolve.js happen to load before the hooks are installed. That can be very brittle and stops us from optimizing the loading of the module loader. Using a wrapper directly in the implementations would make this problem go away.
| console.log('relative require from main script passed'); | ||
|
|
||
| // Test transitive requires: calculator.js requires ./math.js internally | ||
| const calculator = require('./modules/calculator.js'); |
There was a problem hiding this comment.
Can you also add a test for package lookups? (in that case, the files need to be placed in node_modules in the vfs)
| throw createENOENT('open', path); | ||
| } | ||
| const content = this[kGetAsset](key); | ||
| return Buffer.from(content); |
There was a problem hiding this comment.
This only creates a view of the raw buffer, and if I am reading it correctly, doesn't create a copy in the callers. I think this needs a call out in the documentation that when using VFS in SEA this way, the buffers returned by e.g. readFileSync must not be mutated. That is, if you do const buffer = fs.readFileSync('/sea/asset'); buffer[0] = 1 the process will crash (because this will then try to write into read-only segments)
Another alternative is to make the copying behavior a toggle in the SEA config, so that users can choose between always getting a copy for safety, or always avoiding a copy for perf and they will wrap it with a copy when they do need to mutate the buffer. It probably makes more sense to leave the copying behavior by default, because when applications do crash for trying to write into a buffer that gets passed around, it might be rather difficult for users to track down where that buffer wrapping over read-only segment comes from, or how that end up being written into by other code. The no copy behavior is better left as an optimization for people who know what they are doing.
|
Since the SEA provider seems to attempt to allow write into read-only segments that can lead to a crash...have you checked in other providers, doing two const b1 = vfs.readFileSync('/a');
const b2 = vfs.readFileSync('/a');
b1[0] = 1;
b2[0] = 2;
assert.strictEqual(b1[0], 1); // The buffer is not shared with b2. It otherwise might be surprising to users, because that's not something you'd expect from the usual (I tried to look for it but the diff is getting a bit too big to navigate..) |
|
The diff is close to impossible to navigate at this point :(. |
|
Maybe it makes sense to split the test runner mocks and the SEA/module loading integration out, and keep this focus on the core VFS? Those APIs have their own can of worms to deal with (I guess test runner/SEA has fewer worms, but module loaders have a lot of weird corners). |
Failed to start CI⚠ Commits were pushed since the last approving review: ⚠ - fs: add virtual file system support ⚠ - vfs: add Windows path compatibility ⚠ - sea: support VFS in embedderRequire ⚠ - test: add tmpdir.refresh() to SEA VFS test ⚠ - test: update SEA VFS test for new buildSEA API ⚠ - vfs: add provider-based architecture and node:vfs module ⚠ - vfs: remove backward compat methods, use standard fs API ⚠ - vfs: address review comments ⚠ - vfs: fix lint errors ⚠ - vfs: fix lint errors ⚠ - vfs: remove public createSEA() ⚠ - vfs: address review comments ⚠ - doc: address review comments on VFS documentation ⚠ - doc: clarify virtualCwd behavior in Worker threads ⚠ - vfs: add RealFSProvider for mounting real directories ⚠ - tools: add VFS types to doc type-parser ⚠ - doc: use REPLACEME for version placeholders in vfs.md ⚠ - doc: add security warnings and symlink documentation to vfs.md ⚠ - vfs: address code review feedback from @jasnell ⚠ - vfs: add overlay mode for selective file interception ⚠ - vfs: add tests and fix appendFile, add readonly checks ⚠ - fs: remove createVirtual, use node:vfs instead ⚠ - vfs: add watch and watchFile support ⚠ - vfs: improve test coverage for watch implementation ⚠ - vfs: remove unused utility functions from module_hooks ⚠ - vfs: add test for symlink target creation after symlink ⚠ - doc: add worker thread limitations to VFS documentation ⚠ - doc: clarify VFS accepts same types as fs module ⚠ - vfs: remove unused entries.js, add error tests ⚠ - vfs: remove unused exports from fd.js and stats.js ⚠ - vfs: remove unused VirtualFD methods ⚠ - vfs: remove more unused VirtualFD code ⚠ - vfs: address Aviv's review comments ⚠ - test: improve VFS code coverage ⚠ - Update lib/internal/vfs/module_hooks.js ⚠ - Update lib/internal/vfs/module_hooks.js ⚠ - vfs: address aduh95 review comments ⚠ - vfs: remove SEAProvider export and hasSeaAssets ⚠ - vfs: remove addFile and addDirectory methods ⚠ - doc: alphabetize VirtualFileSystem members in vfs.md ⚠ - Update single-executable-applications.md ⚠ - sea: remove getVfs from public API ⚠ - doc: explain symlink behavior in overlay mode ⚠ - doc: document overlay mode behavior for fs operations ⚠ - vfs: add Symbol.dispose support for automatic unmount ⚠ - vfs: add missing JSDoc @returns for mount() ⚠ - vfs: only initialize SEA VFS when assets exist ⚠ - vfs: use path.posix methods instead of custom implementations ⚠ - vfs: address review feedback for SEA docs and getSeaVfs ⚠ - vfs: refactor SEA VFS with initSeaVfs/getSeaVfs pattern ⚠ - vfs: address test review feedback ⚠ - vfs: replace custom path helpers with standard path module ⚠ - vfs: address remaining review feedback from avivkeller ⚠ - test: expand VFS provider and promises test coverage ⚠ - fixup! test: expand VFS provider and promises test coverage ⚠ - doc: address vfs documentation review feedback ⚠ - test: add Windows-specific VFS mount path tests ⚠ - vfs: address code review feedback ⚠ - vfs: add process vfs-mount and vfs-unmount events ⚠ - vfs: address code review feedback ⚠ - doc: add case sensitivity documentation for VFS overlay mode ⚠ - sea: use wrapModuleLoad for VFS module loading ⚠ - doc: add VFS limitations and cross-references ⚠ - vfs: fix path normalization on Windows ⚠ - sea,vfs: address review feedback from Joyee ⚠ - test: use relative require in SEA VFS test ⚠ - sea,vfs: fix C++ formatting and markdown link ordering ✘ Refusing to run CI on potentially unsafe PRhttps://github.com/nodejs/node/actions/runs/22160773192 |
Failed to start CI⚠ Commits were pushed since the last approving review: ⚠ - fs: add virtual file system support ⚠ - vfs: add Windows path compatibility ⚠ - sea: support VFS in embedderRequire ⚠ - test: add tmpdir.refresh() to SEA VFS test ⚠ - test: update SEA VFS test for new buildSEA API ⚠ - vfs: add provider-based architecture and node:vfs module ⚠ - vfs: remove backward compat methods, use standard fs API ⚠ - vfs: address review comments ⚠ - vfs: fix lint errors ⚠ - vfs: fix lint errors ⚠ - vfs: remove public createSEA() ⚠ - vfs: address review comments ⚠ - doc: address review comments on VFS documentation ⚠ - doc: clarify virtualCwd behavior in Worker threads ⚠ - vfs: add RealFSProvider for mounting real directories ⚠ - tools: add VFS types to doc type-parser ⚠ - doc: use REPLACEME for version placeholders in vfs.md ⚠ - doc: add security warnings and symlink documentation to vfs.md ⚠ - vfs: address code review feedback from @jasnell ⚠ - vfs: add overlay mode for selective file interception ⚠ - vfs: add tests and fix appendFile, add readonly checks ⚠ - fs: remove createVirtual, use node:vfs instead ⚠ - vfs: add watch and watchFile support ⚠ - vfs: improve test coverage for watch implementation ⚠ - vfs: remove unused utility functions from module_hooks ⚠ - vfs: add test for symlink target creation after symlink ⚠ - doc: add worker thread limitations to VFS documentation ⚠ - doc: clarify VFS accepts same types as fs module ⚠ - vfs: remove unused entries.js, add error tests ⚠ - vfs: remove unused exports from fd.js and stats.js ⚠ - vfs: remove unused VirtualFD methods ⚠ - vfs: remove more unused VirtualFD code ⚠ - vfs: address Aviv's review comments ⚠ - test: improve VFS code coverage ⚠ - Update lib/internal/vfs/module_hooks.js ⚠ - Update lib/internal/vfs/module_hooks.js ⚠ - vfs: address aduh95 review comments ⚠ - vfs: remove SEAProvider export and hasSeaAssets ⚠ - vfs: remove addFile and addDirectory methods ⚠ - doc: alphabetize VirtualFileSystem members in vfs.md ⚠ - Update single-executable-applications.md ⚠ - sea: remove getVfs from public API ⚠ - doc: explain symlink behavior in overlay mode ⚠ - doc: document overlay mode behavior for fs operations ⚠ - vfs: add Symbol.dispose support for automatic unmount ⚠ - vfs: add missing JSDoc @returns for mount() ⚠ - vfs: only initialize SEA VFS when assets exist ⚠ - vfs: use path.posix methods instead of custom implementations ⚠ - vfs: address review feedback for SEA docs and getSeaVfs ⚠ - vfs: refactor SEA VFS with initSeaVfs/getSeaVfs pattern ⚠ - vfs: address test review feedback ⚠ - vfs: replace custom path helpers with standard path module ⚠ - vfs: address remaining review feedback from avivkeller ⚠ - test: expand VFS provider and promises test coverage ⚠ - fixup! test: expand VFS provider and promises test coverage ⚠ - doc: address vfs documentation review feedback ⚠ - test: add Windows-specific VFS mount path tests ⚠ - vfs: address code review feedback ⚠ - vfs: add process vfs-mount and vfs-unmount events ⚠ - vfs: address code review feedback ⚠ - doc: add case sensitivity documentation for VFS overlay mode ⚠ - sea: use wrapModuleLoad for VFS module loading ⚠ - doc: add VFS limitations and cross-references ⚠ - vfs: fix path normalization on Windows ⚠ - sea,vfs: address review feedback from Joyee ⚠ - test: use relative require in SEA VFS test ⚠ - sea,vfs: fix C++ formatting and markdown link ordering ✘ Refusing to run CI on potentially unsafe PRhttps://github.com/nodejs/node/actions/runs/22171325106 |
A first-class virtual file system module (
node:vfs) with a provider-based architecture that integrates with Node.js's fs module and module loader.Key Features
Provider Architecture - Extensible design with pluggable providers:
MemoryProvider- In-memory file system with full read/write supportSEAProvider- Read-only access to Single Executable Application assetsVirtualProvider- Base class for creating custom providersStandard fs API - Uses familiar
writeFileSync,readFileSync,mkdirSyncinstead of custom methodsMount Mode - VFS mounts at a specific path prefix (e.g.,
/virtual), clear separation from real filesystemModule Loading -
require()andimportwork seamlessly from virtual filesSEA Integration - Assets automatically mounted at
/seawhen running as a Single Executable ApplicationFull fs Support - readFile, stat, readdir, exists, streams, promises, glob, symlinks
Example
SEA Usage
When running as a Single Executable Application, bundled assets are automatically available:
Public API
Disclaimer: I've used a significant amount of Claude Code tokens to create this PR. I've reviewed all changes myself.