# generation-nodePDF-service

**Repository Path**: ericptt/generation-node-pdf-service

## Basic Information

- **Project Name**: generation-nodePDF-service
- **Description**: A simple Node.js service that uses Express + Puppeteer to render any URL as a PDF and return it as a file stream. Suitable for being called by Java or other backend services to generate PDFs.
- **Default Branch**: master
- **Created**: 2025-11-28
- **Last Updated**: 2025-11-28

## README

# Express + Puppeteer PDF Service

A simple, stable Node.js service that uses Express + Puppeteer to render any URL as a PDF and return it as a file stream. It supports GET and POST, multiple paper sizes and rendering parameters, API key authentication, an IP allowlist, Docker, and an environment with Chinese fonts.

## Quick Start

```powershell
# Install dependencies
npm install

# Start the service (default port 3031)
npm start

# Health check
curl.exe "http://localhost:3031/health"
```

## Endpoints

- `GET /health` → health check, returns `{status:"ok"}`
- `GET /pdf?url=...` → generates a PDF from query parameters (returns an `application/pdf` file stream)
- `POST /pdf` → generates a PDF from body parameters (supports `application/json`, `application/x-www-form-urlencoded`, `multipart/form-data` with fields only, and bodies mislabeled as `text/plain`)

## Parameters (shared by the GET query and the POST body)

- `url`: target page URL (required, http/https)
- `filename`: file name without extension, default `document`
- `format`: paper size, such as `A4` or `Letter`; use either this or `width`/`height`
- `width`/`height`: custom size (such as `210mm`, `8.27in`)
- `landscape`: landscape orientation, default `false`
- `printBackground`: print backgrounds, default `true`
- `preferCSSPageSize`: prefer the CSS `@page` size, default `true`
- `scale`: zoom factor `0.1~2`, default `1`
- `timeout`: load/wait timeout in milliseconds, default `45000`
- `waitUntil`: `load|domcontentloaded|networkidle0|networkidle2`, default `networkidle2`
- `waitFor`: extra wait, either a number of milliseconds (such as `1500`) or a CSS selector (such as `#ready`)
- `marginTop|marginRight|marginBottom|marginLeft`: margins (such as `10mm`)

## Usage Examples

GET (URL encoded automatically)

```powershell
curl.exe --get "http://localhost:3031/pdf" --data-urlencode "url=https://example.com" --output out.pdf
```

POST (JSON, recommended)

```powershell
$body = @{ url = "https://example.com"; filename = "example"; waitFor = 1500 } | ConvertTo-Json
Invoke-WebRequest -Uri "http://localhost:3031/pdf" -Method Post -ContentType 'application/json' -Body $body -OutFile out.pdf
```

POST (x-www-form-urlencoded)

```bash
curl -X POST "http://localhost:3031/pdf" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "url=https://example.com&filename=example&landscape=true" \
  --output out.pdf
```

POST (multipart/form-data, fields only)

```bash
curl -X POST "http://localhost:3031/pdf" \
  -F "url=https://example.com" \
  -F "filename=example" \
  -F "landscape=true" \
  --output out.pdf
```

## Calling from Java

Download a PDF to a local file with the Java 11+ `HttpClient`:

```java
import java.io.*;
import java.net.*;
import java.net.http.*;

public class PdfClientExample {
    public static void main(String[] args) throws Exception {
        String target = "https://exam.r1sx.com/reportshare?uuid=f65809ef-1714-4619-adb4-ac3ae932a173";
        // The target URL must be percent-encoded before it is placed in the query string.
        String service = "http://localhost:3031/pdf?" +
                "url=" + URLEncoder.encode(target, java.nio.charset.StandardCharsets.UTF_8);

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(service))
                .GET()
                .build();

        // Stream the response body instead of buffering the whole PDF in memory.
        HttpResponse<InputStream> resp = client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        if (resp.statusCode() == 200) {
            try (InputStream in = resp.body(); OutputStream out = new FileOutputStream("report.pdf")) {
                in.transferTo(out);
            }
            System.out.println("Saved to report.pdf");
        } else {
            System.err.println("Failed: HTTP " + resp.statusCode());
        }
    }
}
```

## Configuration (Environment Variables)

- `PORT`: service port, default `3031`
- `HEADLESS`: `1` or unset runs headless; `0` runs with a visible browser (for debugging)
- `PUPPETEER_EXECUTABLE_PATH`/`CHROME_PATH`: use a locally installed Chrome/Chromium, which lets you skip downloading the bundled browser
- `PUPPETEER_NO_SANDBOX`: when set to `1`, adds the `--no-sandbox` argument (needed in some containers and CI environments)
- `NAVIGATION_TIMEOUT_MS`: page load/wait timeout, default `45000`
- `API_KEY` or `API_KEYS`: one or more API keys (comma-separated); setting either enables authentication
- `ALLOW_IPS`: list of IPs allowed to access the service (comma-separated; supports `127.0.0.1`, `::1`, `192.168.0.0/16`, `*`). By default only the local machine is allowed: `127.0.0.1,::1`.
- `TRUST_PROXY`: when `1`/`true`, enables `trust proxy` so the client IP is identified correctly behind a reverse proxy

Using the system Chrome on Windows:

```powershell
$env:PUPPETEER_EXECUTABLE_PATH = "C:\Program Files\Google\Chrome\Application\chrome.exe"
npm start
```

## Notes

- If the target link requires login, authentication, or specific cookies, note that this service has no built-in login flow; extend it as needed.
- When exposing the service externally, add an allowlist or authentication to avoid SSRF risks.
- The first Puppeteer installation downloads Chromium (a large download), so make sure the network connection is reliable.

## Authentication Examples

Enable an API key and the IP allowlist:

```powershell
$env:API_KEYS = "test-key"
$env:ALLOW_IPS = "*"
npm start
```

Access with an API key (GET):

```powershell
curl.exe --get "http://localhost:3031/pdf" `
  -H "x-api-key: test-key" `
  --data-urlencode "url=https://example.com" -o out.pdf
```

Access with an API key (POST):

```powershell
$body = @{ url = "https://example.com"; filename = "example" } | ConvertTo-Json
Invoke-WebRequest -Uri "http://localhost:3031/pdf" -Method Post -ContentType 'application/json' -Headers @{"x-api-key"="test-key"} -Body $body -OutFile example.pdf
```

## Docker Deployment

This repository ships with a `Dockerfile` (based on the official puppeteer image, with Chinese fonts installed).

```powershell
docker build -t node-pdf-service .
docker run --rm -p 3031:3031 --name node-pdf node-pdf-service

# Test
curl.exe --get "http://localhost:3031/pdf" --data-urlencode "url=https://example.com" --output out.pdf
```

When running as root inside the container, `--no-sandbox` is already enabled by default (via `PUPPETEER_NO_SANDBOX=1`).

## Linux Bare-Metal Deployment (Node 18+ / PM2 / Google Chrome / NGINX)

The steps below use Ubuntu/Debian; commands on other distributions are similar (CentOS/RHEL use dnf/yum).

1) Install Node.js 18+ and PM2

```bash
curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash -
sudo apt-get install -y nodejs
node -v
sudo npm i -g pm2
```

Reference: https://blog.csdn.net/qq_40743057/article/details/139139574

2) Install Google Chrome (official repository)

Recommended approach (keyrings):

```bash
# 1. Add the Chrome signing key to a dedicated keyring, then add the Chrome repository
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] https://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google-chrome.list

# 2. Install Chrome
sudo apt update && sudo apt install -y google-chrome-stable

# 3. Verify the installation (check the version and confirm the path)
google-chrome --version
which google-chrome  # prints /usr/bin/google-chrome (used in the configuration below)
```

(If you have older notes that rely on `apt-key`, you can still use that familiar approach, but `apt-key` is deprecated, so the keyring approach above is recommended.)

3) Install Chinese fonts (to avoid Chinese text rendering as empty boxes)

See the "Linux Chinese Fonts" section below for details. Quick commands for Ubuntu/Debian:

```bash
sudo apt-get update
sudo apt-get install -y fonts-noto-cjk fonts-wqy-zenhei fonts-wqy-microhei fonts-arphic-ukai fonts-arphic-uming
sudo fc-cache -f -v
```

4) Start the service (PM2)

- Simple way (set the environment variables once and start):

```bash
cd /path/to/generation-nodePDF-service
npm ci --omit=dev || npm install --only=prod

env PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome \
    PUPPETEER_NO_SANDBOX=1 \
    PORT=3031 \
    pm2 start src/server.js --name node-pdf --time

pm2 save
```

- Or use an ecosystem file (persistent environment):

```bash
cat > ecosystem.config.js <<'EOF'
module.exports = {
  apps: [{
    name: 'node-pdf',
    script: 'src/server.js',
    env: {
      NODE_ENV: 'production',
      PORT: '3031',
      PUPPETEER_EXECUTABLE_PATH: '/usr/bin/google-chrome',
      PUPPETEER_NO_SANDBOX: '1'
    }
  }]
}
EOF

pm2 start ecosystem.config.js
pm2 save
pm2 startup systemd  # run the last command it prints
```

5) Configure an NGINX reverse proxy (optional)

```bash
# The quoted heredoc delimiter keeps $host and the other NGINX variables from being expanded by the shell
sudo tee /etc/nginx/sites-available/node-pdf > /dev/null <<'EOF'
server {
    listen 80;
    server_name your-domain.example.com;  # change to your domain or server IP

    location / {
        proxy_pass http://127.0.0.1:3031;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 120s;
        client_max_body_size 5m;
    }
}
EOF

sudo ln -s /etc/nginx/sites-available/node-pdf /etc/nginx/sites-enabled/node-pdf
sudo nginx -t && sudo systemctl reload nginx
```

6) Verify

```bash
curl -sS http://localhost:3031/health
curl -X POST "http://localhost:3031/pdf" -H "Content-Type: application/json" -d '{"url":"https://example.com"}' --output out.pdf
```

## Linux Chinese Fonts (when running outside a container)

If you are not using the container, install Chinese fonts at the system level to avoid Chinese text rendering as empty boxes:

Debian/Ubuntu:

```bash
sudo apt-get update
sudo apt-get install -y fonts-noto-cjk fonts-wqy-zenhei fonts-wqy-microhei fonts-arphic-ukai fonts-arphic-uming
sudo fc-cache -f -v
```

CentOS/RHEL/Fedora:

```bash
sudo dnf install -y google-noto-sans-cjk-ttc wqy-zenhei-fonts wqy-microhei-fonts || \
sudo yum install -y google-noto-sans-cjk-ttc wqy-zenhei-fonts wqy-microhei-fonts
sudo fc-cache -f -v
```

## Troubleshooting

- Startup fails with "Running as root without --no-sandbox" → set the environment variable:

  ```bash
  export PUPPETEER_NO_SANDBOX=1
  npm start
  ```

- POST returns "Invalid or missing url parameter" → check that the `Content-Type` matches the body format and that `url` starts with http/https.
- Garbled Chinese or empty boxes → install Chinese fonts or use this repository's Docker image; specify a Chinese font family in the CSS; the code already sets `--lang=zh-CN`.
- With Puppeteer v24, a numeric `waitFor` is implemented as a Promise-based delay; a selector `waitFor` uses `page.waitForSelector`.

## Troubleshooting (POST returns Invalid or missing url parameter)

- Wrong Content-Type: for JSON requests make sure to pass `-ContentType application/json`; for form requests `-d key=value` is enough, or set `-H "Content-Type: application/x-www-form-urlencoded"` explicitly.
- JSON body not sent correctly: in PowerShell, prefer building `$body` with `ConvertTo-Json` and sending it with `Invoke-WebRequest`.
- Invalid URL format: it must be a complete URL starting with `http://` or `https://`.
- Escaping errors from multi-line commands: in PowerShell, do not use `^` for line continuation; for multiple lines use the backtick `` ` ``, or put the command on a single line.

## License

MIT

## Spring Boot Integration (Wrapper Classes)

The following is a synchronous wrapper based on `RestTemplate`. It supports both GET (query parameters) and POST (JSON), attaches the API key automatically, and streams the response straight to a file to avoid holding large PDFs in memory.

Dependency (Spring Boot Web already includes `RestTemplate`):

```xml
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>
```

Configuration (application.yml):

```yaml
pdf:
  client:
    base-url: http://localhost:3031
    api-key: test-key   # leave empty if authentication is not enabled
    connect-timeout-ms: 5000
    read-timeout-ms: 60000
```

Properties class:

```java
// src/main/java/com/example/pdf/PdfClientProperties.java
package com.example.pdf;

import org.springframework.boot.context.properties.ConfigurationProperties;

@ConfigurationProperties(prefix = "pdf.client")
public class PdfClientProperties {
    private String baseUrl;
    private String apiKey;
    private int connectTimeoutMs = 5000;
    private int readTimeoutMs = 60000;

    // getters/setters
    public String getBaseUrl() { return baseUrl; }
    public void setBaseUrl(String baseUrl) { this.baseUrl = baseUrl; }
    public String getApiKey() { return apiKey; }
    public void setApiKey(String apiKey) { this.apiKey = apiKey; }
    public int getConnectTimeoutMs() { return connectTimeoutMs; }
    public void setConnectTimeoutMs(int connectTimeoutMs) { this.connectTimeoutMs = connectTimeoutMs; }
    public int getReadTimeoutMs() { return readTimeoutMs; }
    public void setReadTimeoutMs(int readTimeoutMs) { this.readTimeoutMs = readTimeoutMs; }
}
```

Configuration and bean:

```java
// src/main/java/com/example/pdf/PdfClientConfig.java
package com.example.pdf;

import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

@Configuration
@EnableConfigurationProperties(PdfClientProperties.class)
public class PdfClientConfig {
    @Bean
    public RestTemplate restTemplate(PdfClientProperties props) {
        // Apply the connect/read timeouts from application.yml
        var f = new SimpleClientHttpRequestFactory();
        f.setConnectTimeout(props.getConnectTimeoutMs());
        f.setReadTimeout(props.getReadTimeoutMs());
        return new RestTemplate(f);
    }
}
```

Options object:

```java
// src/main/java/com/example/pdf/PdfRequestOptions.java
package com.example.pdf;

public class PdfRequestOptions {
    private String filename;            // without extension
    private String format = "A4";       // or width/height
    private Boolean landscape;          // true/false
    private Boolean printBackground;    // true/false
    private Boolean preferCSSPageSize;  // true/false
    private Double scale;               // 0.1 ~ 2
    private Long timeout;               // ms
    private String waitUntil;           // load|domcontentloaded|networkidle0|networkidle2
    private String waitFor;             // ms or a selector
    private String width;               // e.g. 210mm
    private String height;              // e.g. 297mm

    // getters/setters omitted
    // ...
}
```

Wrapper client:

```java
// src/main/java/com/example/pdf/PdfServiceClient.java
package com.example.pdf;

import org.springframework.http.*;
import org.springframework.stereotype.Component;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.client.RestTemplate;
import org.springframework.web.util.UriComponentsBuilder;

import java.io.FileOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

@Component // registered as a bean so it can be injected (see DemoService below)
public class PdfServiceClient {
    private final RestTemplate restTemplate;
    private final PdfClientProperties props;

    public PdfServiceClient(RestTemplate restTemplate, PdfClientProperties props) {
        this.restTemplate = restTemplate;
        this.props = props;
    }

    public void downloadPdfGet(String targetUrl, Path output, PdfRequestOptions opt) throws Exception {
        MultiValueMap<String, String> q = new LinkedMultiValueMap<>();
        q.add("url", targetUrl);
        if (opt != null) {
            putIfNotNull(q, "filename", opt.getFilename());
            putIfNotNull(q, "format", opt.getFormat());
            putIfNotNull(q, "landscape", toStringOrNull(opt.getLandscape()));
            putIfNotNull(q, "printBackground", toStringOrNull(opt.getPrintBackground()));
            putIfNotNull(q, "preferCSSPageSize", toStringOrNull(opt.getPreferCSSPageSize()));
            putIfNotNull(q, "scale", toStringOrNull(opt.getScale()));
            putIfNotNull(q, "timeout", toStringOrNull(opt.getTimeout()));
            putIfNotNull(q, "waitUntil", opt.getWaitUntil());
            putIfNotNull(q, "waitFor", opt.getWaitFor());
            putIfNotNull(q, "width", opt.getWidth());
            putIfNotNull(q, "height", opt.getHeight());
        }

        // The values above are raw, so they are percent-encoded here.
        // build(true) would treat them as already encoded and reject a target URL containing '='.
        URI uri = UriComponentsBuilder.fromUriString(props.getBaseUrl())
                .path("/pdf")
                .queryParams(q)
                .build()
                .encode()
                .toUri();

        HttpHeaders headers = new HttpHeaders();
        if (props.getApiKey() != null && !props.getApiKey().isBlank()) {
            headers.set("x-api-key", props.getApiKey());
        }

        RequestEntity<Void> req = new RequestEntity<>(headers, HttpMethod.GET, uri);
        streamToFile(req, output);
    }

    public void downloadPdfPost(String targetUrl, Path output, PdfRequestOptions opt) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("url", targetUrl);
        if (opt != null) {
            putIfNotNull(body, "filename", opt.getFilename());
            putIfNotNull(body, "format", opt.getFormat());
            putIfNotNull(body, "landscape", opt.getLandscape());
            putIfNotNull(body, "printBackground", opt.getPrintBackground());
            putIfNotNull(body, "preferCSSPageSize", opt.getPreferCSSPageSize());
            putIfNotNull(body, "scale", opt.getScale());
            putIfNotNull(body, "timeout", opt.getTimeout());
            putIfNotNull(body, "waitUntil", opt.getWaitUntil());
            putIfNotNull(body, "waitFor", opt.getWaitFor());
            putIfNotNull(body, "width", opt.getWidth());
            putIfNotNull(body, "height", opt.getHeight());
        }

        URI uri = UriComponentsBuilder.fromUriString(props.getBaseUrl())
                .path("/pdf")
                .build()
                .toUri();

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        if (props.getApiKey() != null && !props.getApiKey().isBlank()) {
            headers.set("x-api-key", props.getApiKey());
        }

        RequestEntity<Map<String, Object>> req = new RequestEntity<>(body, headers, HttpMethod.POST, uri);
        streamToFile(req, output);
    }

    private void streamToFile(RequestEntity<?> request, Path output) throws Exception {
        Files.createDirectories(output.toAbsolutePath().getParent());
        restTemplate.execute(request.getUrl(), request.getMethod(), clientHttpRequest -> {
            // headers
            request.getHeaders().forEach((k, v) -> clientHttpRequest.getHeaders().put(k, v));
            // body (serialized as JSON)
            if (request.hasBody()) {
                var mapper = new com.fasterxml.jackson.databind.ObjectMapper();
                byte[] json = mapper.writeValueAsBytes(request.getBody());
                clientHttpRequest.getBody().write(json);
            }
        }, clientHttpResponse -> {
            if (clientHttpResponse.getStatusCode().is2xxSuccessful()) {
                // Copy the response stream to the file without buffering the whole PDF
                try (OutputStream out = new FileOutputStream(output.toFile())) {
                    clientHttpResponse.getBody().transferTo(out);
                }
                return null;
            }
            throw new IllegalStateException("HTTP " + clientHttpResponse.getStatusCode());
        });
    }

    private static void putIfNotNull(MultiValueMap<String, String> map, String k, String v) {
        if (v != null && !v.isBlank()) map.add(k, v);
    }

    private static void putIfNotNull(Map<String, Object> map, String k, Object v) {
        if (v != null) map.put(k, v);
    }

    private static String toStringOrNull(Object o) { return o == null ? null : String.valueOf(o); }
}
```

Usage example:

```java
// src/main/java/com/example/demo/DemoService.java
package com.example.demo;

import com.example.pdf.*;
import org.springframework.stereotype.Service;

import java.nio.file.Path;

@Service
public class DemoService {
    private final PdfServiceClient pdfClient;

    public DemoService(PdfServiceClient pdfClient) {
        this.pdfClient = pdfClient;
    }

    public void exportReport() throws Exception {
        var opts = new PdfRequestOptions();
        opts.setWaitUntil("networkidle0");
        opts.setWaitFor("3000");
        opts.setPrintBackground(true);

        pdfClient.downloadPdfGet(
            "https://exam.r1sx.com/reportshare?uuid=f65809ef-1714-4619-adb4-ac3ae932a173",
            Path.of("reports/report.pdf"),
            opts
        );
    }
}
```

Note: if the server has the IP allowlist enabled, make sure the IP of the calling Spring Boot application is allowed, or set `ALLOW_IPS` to `*` during development. If the server has API keys enabled, configure `pdf.client.api-key` in `application.yml`.
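
The `url` rule from the parameter list (a complete http/https URL) and the query encoding that the GET examples depend on can also be checked on the caller side, before a request reaches the service. The following is a minimal sketch using only the JDK; the class `PdfRequestUrl` and its methods are hypothetical helpers, not part of this repository, and the scheme check only approximates whatever validation the server actually performs.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical caller-side helper: validates the target URL and builds a GET /pdf request URL. */
final class PdfRequestUrl {

    /** Returns true only for absolute http/https URLs that java.net.URI can parse (a conservative check). */
    public static boolean isHttpUrl(String target) {
        if (target == null || target.isBlank()) return false;
        try {
            URI uri = new URI(target.trim());
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Builds baseUrl + "/pdf?url=..." with every key and value percent-encoded. */
    public static String build(String baseUrl, String target, Map<String, String> options) {
        if (!isHttpUrl(target)) {
            throw new IllegalArgumentException("url must be a complete http:// or https:// URL");
        }
        StringBuilder sb = new StringBuilder(baseUrl).append("/pdf?url=").append(enc(target));
        for (Map.Entry<String, String> e : options.entrySet()) {
            sb.append('&').append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("format", "A4");
        opts.put("waitFor", "#ready");
        // prints http://localhost:3031/pdf?url=https%3A%2F%2Fexample.com%2Freport%3Fid%3D1&format=A4&waitFor=%23ready
        System.out.println(build("http://localhost:3031", "https://example.com/report?id=1", opts));
    }
}
```

Rejecting a malformed target locally gives a clearer error than the service's "Invalid or missing url parameter" response, and encoding each value keeps characters such as `?`, `=`, `&`, and `#` inside the target URL from being read as part of the service's own query string.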