Build crashing when metafile size > 512MB #4329

@elbywan

Description

👋 Kudos for this awesome build tool!

In a nutshell

When bundling very large projects with the metafile: true flag, the build crashes with the following error:

Error: Cannot create a string longer than 0x1fffffe8 characters
    at TextDecoder.decode (node:internal/encoding:447:16)
    at decodeUTF8 (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:188:35)
    at visit (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:99:16)
    at visit (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:114:43)
    at decodePacket (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:126:15)
    at handleIncomingPacket (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:651:18)
    at Socket.readFromStdout (/Users/xxx/dev/web-ui/.yarn/cache/esbuild-npm-0.25.10-f26f7be387-a8e4d33d7e.zip/node_modules/esbuild/lib/main.js:581:7)
    at Socket.emit (node:events:524:28)
    at Socket.emit (node:domain:489:12)
    at addChunk (node:internal/streams/readable:561:12) {
  code: 'ERR_STRING_TOO_LONG'
}
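
Here is a stripped-down version of the build call that triggers it (the entry point is a placeholder; metafile: true is the important part):

const esbuild = require("esbuild");

esbuild
  .build({
    entryPoints: ["src/index.js"], // placeholder; the real build has far more
    bundle: true,
    outdir: "dist",
    metafile: true, // required by our plugins; produces the oversized JSON packet
  })
  .then((result) => {
    // Never reached on the big build: decoding the response packet throws
    // ERR_STRING_TOO_LONG before result.metafile ever exists.
    console.log(Object.keys(result.metafile.outputs).length);
  });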

Context

We are currently using esbuild at our company to build an application that lives in a huge monorepo (>10M LOC).
The metafile flag is a requirement for us, as specific build plugins consume its data.

The codebase is constantly growing, and we started hitting this error recently.

From what I could gather, this is caused by two things (a minimal repro follows the list):

  • The maximum length of a string in the V8 engine is 0x1fffffe8 characters, roughly 512MB (defined here)
  • The Node.js process decodes the entire metafile payload into a single string before calling JSON.parse(), exceeding that threshold
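
A minimal sketch of the limit outside of esbuild (the 512MB allocation itself is fine; only decoding it into a single string throws):

// One byte past V8's maximum string length: the buffer allocates without
// complaint, but decoding it into a single string blows up.
const big = Buffer.alloc(0x1fffffe8 + 1);
new TextDecoder().decode(big);
// → Error: Cannot create a string longer than 0x1fffffe8 characters
//   code: 'ERR_STRING_TOO_LONG'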

I also noticed that the JSON was not minified, so as a (very dirty) workaround to buy us some time, I patched lib/main.js to chunk the data and minify the JSON on the fly, reducing the final string length.

(like this)
diff --git a/lib/main.js b/lib/main.js
index 0f61c81621ded9262d532307857e252673c76473..db4b1b9f0cd8d2c74fa380630c7aff8b666859ee 100644
--- a/lib/main.js
+++ b/lib/main.js
@@ -178,6 +178,59 @@ var ByteBuffer = class {
     return bytes;
   }
 };
+
+// [BEGIN PATCH]
+const TEMP_BUFFER_WINDOW = 100_000;
+const decodeWithFallback = (decodeFn) => (bytes) => {
+  try {
+    // Attempt to decode the bytes into a UTF-8 string
+    return decodeFn(bytes);
+  } catch (error) {
+    // Ouch, it failed :(
+    // This likely means that the bytes array is too big and won't fit into
+    // a single Node.js (V8) string, as it exceeds the 512MB limit
+    const { buffer, byteOffset, byteLength } = bytes;
+    const buf = Buffer.from(buffer, byteOffset, byteLength);
+
+    const now = performance.now();
+    const tmpFolder = require("node:os").tmpdir();
+    const filePath = require("node:path").join(
+      tmpFolder,
+      `esbuild-packet-${now}.json`
+    );
+
+    console.log(`[!] Overweight esbuild JSON message (${byteLength} bytes)`);
+    console.log(`    Attempting to minify… (using temporary file: ${filePath})`);
+
+    const fd = fs.openSync(filePath, "w");
+
+      // We know these bytes represent JSON, so we can read the message chunk by chunk
+      // and minify it on the fly (it comes prettified from the Go side with a lot of unnecessary whitespace)
+    try {
+      let offset = 0;
+      while (true) {
+        const tempStr = buf
+          .subarray(offset, offset + TEMP_BUFFER_WINDOW)
+          .toString()
+          .replaceAll(/\s*[\r\n]\s*/g, "")
+          .replaceAll(/"([^"]+)":\s*"/g, '"$1":"');
+        fs.writeFileSync(fd, tempStr);
+        if (offset >= buf.length) {
+          break;
+        }
+        offset = offset + TEMP_BUFFER_WINDOW;
+      }
+
+      console.log(`    Done minifying, final size: ${fs.statSync(filePath).size} bytes`);
+
+      return fs.readFileSync(filePath, "utf-8");
+    } finally {
+      fs.closeSync(fd);
+    }
+  }
+};
+// [END PATCH]
+
 var encodeUTF8;
 var decodeUTF8;
 var encodeInvariant;
@@ -185,14 +238,16 @@ if (typeof TextEncoder !== "undefined" && typeof TextDecoder !== "undefined") {
   let encoder = new TextEncoder();
   let decoder = new TextDecoder();
   encodeUTF8 = (text) => encoder.encode(text);
-  decodeUTF8 = (bytes) => decoder.decode(bytes);
+  // [PATCHED]
+  decodeUTF8 = decodeWithFallback((bytes) => decoder.decode(bytes));
   encodeInvariant = 'new TextEncoder().encode("")';
 } else if (typeof Buffer !== "undefined") {
   encodeUTF8 = (text) => Buffer.from(text);
-  decodeUTF8 = (bytes) => {
+  // [PATCHED]
+  decodeUTF8 = decodeWithFallback((bytes) => {
     let { buffer, byteOffset, byteLength } = bytes;
     return Buffer.from(buffer, byteOffset, byteLength).toString();
-  };
+  });
   encodeInvariant = 'Buffer.from("")';
 } else {
   throw new Error("No UTF-8 codec found");
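
One refinement I'd probably make on top of this: only take the fallback path for the specific string-length error and re-throw anything else, so genuine decode failures stay visible (minifyOversizedPacket is a hypothetical name for the temp-file logic above):

const decodeWithFallback = (decodeFn) => (bytes) => {
  try {
    return decodeFn(bytes);
  } catch (error) {
    // Only the string-length ceiling warrants the slow minify-to-disk path;
    // any other decode failure is a real bug and should surface.
    if (error.code !== "ERR_STRING_TOO_LONG") throw error;
    return minifyOversizedPacket(bytes);
  }
};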

Now, this is brittle (and I'm not sure how long it will hold), so I'm wondering:

  • is there any reason not to minify the JSON on the Go side before sending it?
  • do you think it would be possible to stream the JSON and build the JS metafile object with a pull parser instead of calling JSON.parse()? (rough sketch after this list)
  • or do you have any other thoughts on the subject?
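
To illustrate the second point, here is roughly what I have in mind, sketched with the third-party stream-json package (purely illustrative: I'm not suggesting adding the dependency, and the file path is made up). The assembler still materializes the full metafile object, but no single >512MB string is ever created, since the V8 limit applies to strings, not objects:

const fs = require("node:fs");
const { parser } = require("stream-json");
const Asm = require("stream-json/Assembler");

// Feed the raw metafile bytes through a streaming tokenizer instead of
// decoding them into one giant string for JSON.parse().
const tokens = fs.createReadStream("/tmp/metafile.json").pipe(parser());

// The assembler rebuilds the object token by token; each individual string
// it creates (paths, import kinds, etc.) stays tiny.
Asm.connectTo(tokens).on("done", (asm) => {
  console.log(Object.keys(asm.current.outputs).length);
});

Something equivalent could presumably be hand-rolled over the packet stream itself; the point is that the tokenizer only ever sees small slices of text.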

Thanks! 🙇
