diff --git a/README.en.md b/README.en.md
index 5ac7ab24bcde6266fe6ce4f534a80d950f9f4327..232ef3dd76a6f80bf05e539d4e722929bcba41a5 100644
--- a/README.en.md
+++ b/README.en.md
@@ -8,11 +8,11 @@ This example utilizes the ArkTS API provided by `@ohos.ai.mindSporeLite` to impl
 | ![Home Page](screenshots/device/screenshot_001.png "Home Page") | ![Album Page](screenshots/device/screenshot_002.png "Album Page") | ![Original Image Preview Page](screenshots/device/screenshot_003.png "Original Image Preview Page") | ![Portrait Segmentation and Composition Results Page](screenshots/device/screenshot_004.png "Portrait Segmentation and Composition Results Page") |
 ## Usage Instructions
-1. On the image segmentation page, click the "Image Composition" button to enter the album selection interface.
-2. In the album interface, select a portrait image and click the "Confirm" button.
-3. After selecting the image, the application will navigate to the image composition page, with the "Original" tab selected by default to display the chosen image.
-4. Clicking the "Composition" tab will automatically perform portrait segmentation model inference on the first background image and the original image, with the results displayed on the main interface.
-5. Under the "Composition" tab, any background from the list can be selected for portrait segmentation inference, and the results will be displayed in real-time on the main interface.
+1. On the image segmentation page, you can click the image synthesis button to enter the album selection interface.
+2. In the album interface, select `a portrait image` (developers are advised to use a 1:1 aspect ratio portrait image to ensure the best synthesis effect), and click the confirm button.
+3. After selecting the image, you will be redirected to the image synthesis page, where the original image tab is selected by default and the selected original image is displayed.
+4. Clicking the synthesis tab will automatically perform portrait image segmentation inference on the first background image and the original image (the inference process involves loading the model into memory and executing the inference. Since the model is large, this process can be time-consuming. Developers can refer to [@ohos.ai.mindSporeLite](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-mindsporelite) for configuring NPU inference to improve inference efficiency). The results of the inference will be displayed on the main interface.
+5. Under the synthesis tab, you can select any background from the list to perform portrait image segmentation inference, and the inference results will be displayed on the main interface.
 ## Project Directory
 ```
@@ -27,13 +27,13 @@ This example utilizes the ArkTS API provided by `@ohos.ai.mindSporeLite` to impl
 │  ├──pages
 │  │  ├──Index.ets // Home page for fetching album images
 │  │  └──ImageGenerate.ets // Original and composite image preview interface
-│  └──util
+│  └──utils
 │     ├──Gaussion.ets // Gaussian filtering algorithm utility class
 │     ├──Logger.ets // Logging tool class
 │     └──Predict.ets // Model inference implementation
 └──entry/src/main/resources/
    └──rawfile
-      └──model_float32.ms // Stored model file
+      └──rmbg_fp16.ms // Stored model file
 ```
 ## Implementation Details
diff --git a/README.md b/README.md
index 109a50b877b6cd94bb31eb76e33bc09ff0aaf516..874754b54e7ea1154a4fe03b4a861b346ab5993c 100644
--- a/README.md
+++ b/README.md
@@ -9,11 +9,11 @@
 使用说明
-1. 在图像分割页面,可以点击图像合成按钮,进入相册选择图片界面
-2. 在相册界面,选择`一张人物图像`,点击确定按钮
-3. 图像选择好后,会跳转到图像合成页面,默认选中原图 tab并展示选择后的原图
-4. 点击合成 tab 会默认对第一张背景图和原图进行人物图像分割模型推理,推理的结果会显示在主界面
-5. 在合成 tab 下,可以选择列表中的任意背景,进行人物图像分割推理,并实时在主界面展示推理结果
+1. 在图像分割页面,可以点击图像合成按钮,进入相册选择图片界面。
+2. 在相册界面,选择`一张人物图像`(建议开发者使用1:1尺寸的人物图像以保证最佳合成效果),点击确定按钮。
+3. 图像选择好后,会跳转到图像合成页面,默认选中原图tab并展示选择后的原图。
+4. 点击合成tab会默认对第一张背景图和原图进行人物图像分割推理(推理过程会涉及模型加载到内存,执行推理,由于模型较大,这个过程会比较耗时,开发者可以参考[@ohos.ai.mindSporeLite](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-mindsporelite)配置NPU推理以提高推理效率),推理的结果会显示在主界面。
+5. 在合成tab下,可以选择列表中的任意背景,进行人物图像分割推理,并在主界面展示推理结果。
 ## 工程目录
 ```
@@ -28,24 +28,24 @@
 │  ├──pages
 │  │  ├──Index.ets // 首页,获取相册图片
 │  │  └──ImageGenerate.ets // 原图和合成图预览界面
-│  └──util
+│  └──utils
 │     ├──Gaussion.ets // 高斯滤波算法工具类
 │     ├──Logger.ets // 日志工具类
 │     └──Predict.ets // 模型推理实现
 └──entry/src/main/resources/
    └──rawfile
-      └──model_float32.ms // 存放的模型文件:model_float32.ms
+      └──rmbg_fp16.ms // 存放的模型文件:rmbg_fp16.ms
 ```
 ## 具体实现
 本示例程序中使用的终端图像分割模型为`model_float32.ms`,放置在`entry\src\main\resources\rawfile`工程目录下。
-- 首页调用[@ohos.file.photoAccessHelper](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-photoaccesshelper) (图片文件选择)拉起相册、[@ohos.multimedia.image](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-image) (图片处理效果)、[@ohos.file.fs](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-file-fs) (基础文件操作) 等API实现相册图片获取及图片处理。完整代码请参见[Index.ets](entry/src/main/ets/pages/Index.ets)、[ImageGenerate.ets](entry/src/main/ets/pages/ImageGenerate.ets)
+- 首页调用[@ohos.file.photoAccessHelper](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-photoaccesshelper)(图片文件选择)拉起相册、[@ohos.multimedia.image](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-image) (图片处理效果)、[@ohos.file.fs](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-file-fs) (基础文件操作) 等API实现相册图片获取及图片处理。完整代码请参见[Index.ets](entry/src/main/ets/pages/Index.ets)、[ImageGenerate.ets](entry/src/main/ets/pages/ImageGenerate.ets)。
-- 图像合成页调用[@ohos.ai.mindSporeLite](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-mindsporelite) (推理能力) API实现端侧推理。完整代码请参见[Predict.ets](entry/src/main/ets/utils/Predict.ets)
+- 图像合成页调用[@ohos.ai.mindSporeLite](https://developer.huawei.com/consumer/cn/doc/harmonyos-references/js-apis-mindsporelite) (推理能力) API实现端侧推理。完整代码请参见[Predict.ets](entry/src/main/ets/utils/Predict.ets)。
-- 调用推理函数并处理结果。完整代码请参见[ImageGenerate.ets](entry/src/main/ets/pages/ImageGenerate.ets)
+- 调用推理函数并处理结果。完整代码请参见[ImageGenerate.ets](entry/src/main/ets/pages/ImageGenerate.ets)。
 ## 时序流程图
 ![](./screenshots/Timing_digram.PNG)
diff --git a/entry/build-profile.json5 b/entry/build-profile.json5
index 38bdcc9929e2c5bd7f51c4fc96a398ccebd0d6ce..9fc7ba4e80809ea5a77d7131ee023beb4290c294 100644
--- a/entry/build-profile.json5
+++ b/entry/build-profile.json5
@@ -2,7 +2,6 @@
   "apiType": "stageMode",
   "buildOption": {
     "externalNativeOptions": {
-      "path": "./src/main/cpp/CMakeLists.txt",
       "arguments": "",
       "cppFlags": "",
     }
diff --git a/entry/src/main/ets/common/constants/ImageDataListConstant.ets b/entry/src/main/ets/common/constants/ImageDataListConstant.ets
index 7d3b729d5f6cacc3b348d81e5012874ffef23b79..b723c4a7c71fb39c7f7d42ba54efefea0a9ee1c4 100644
--- a/entry/src/main/ets/common/constants/ImageDataListConstant.ets
+++ b/entry/src/main/ets/common/constants/ImageDataListConstant.ets
@@ -16,10 +16,10 @@
 /**
  * Model input height and width constants
  */
-export const MODEL_INPUT_HEIGHT = 192;
-export const MODEL_INPUT_WIDTH = 192;
-export const BLACKGROUND_THEADHOLD = 0.5;
-export const MODEL_NAME = "model_float32.ms";
+export const MODEL_INPUT_HEIGHT = 1024;
+export const MODEL_INPUT_WIDTH = 1024;
+export const BLACKGROUND_THEADHOLD = 0.88;
+export const MODEL_NAME = "rmbg_fp16.ms";
 /**
  * List data constants for all features
diff --git a/entry/src/main/ets/pages/ImageGenerate.ets b/entry/src/main/ets/pages/ImageGenerate.ets
index a6ce4c89a843d1d64c3a7964b69cb5905d708671..af1873639c9f500ac193498c3e920d7cd6bb338a 100644
--- a/entry/src/main/ets/pages/ImageGenerate.ets
+++ b/entry/src/main/ets/pages/ImageGenerate.ets
@@ -134,6 +134,7 @@ export struct ImageGenerate {
     backgroundBlurStyle: BlurStyle.BACKGROUND_THICK
   });
   @State currentTab: number = 0;
+  @State canMergeImage: boolean = true;
   @State tabSelectedIndexes: number[] = [this.currentTab];
   @State currentPreviewImageIndex: number = 0;
   @State previewImageList: BackgroundImageItem[] = [
@@ -151,7 +152,7 @@ export struct ImageGenerate {
   handleSegmentButtonChange: Callback = (index: number) => {
     this.currentTab = index;
     // combine image
-    if (this.currentTab === 1) {
+    if (this.currentTab === 1 && this.canMergeImage) {
       this.handleMergeImage();
     }
   }
@@ -159,8 +160,9 @@ export struct ImageGenerate {
     let resMgr: resourceManager.ResourceManager = this.getUIContext().getHostContext()?.getApplicationContext().resourceManager as resourceManager.ResourceManager;
     if (!resMgr) {
-      return
+      return;
     }
+    this.canMergeImage = false;
     const modelBuffer = resMgr.getRawFileContentSync(this.modelName);
     // Preprocess image data
     try {
@@ -197,7 +199,7 @@ export struct ImageGenerate {
         let float32View = new Float32Array(MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 3);
         let means = [0.5, 0.5, 0.5];
-        let stds = [0.5, 0.5, 0.5];
+        let stds = [1.0, 1.0, 1.0];
         let index = 0;
         for (let i = 0; i < imageArr.length; i++) {
           if ((i + 1) % 4 === 0) {
@@ -220,7 +222,7 @@ export struct ImageGenerate {
         let isHumanArr = new Array(MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH);
         for (let i = 0; i < picArr.length; i++) {
           if ((i + 1) % 4 === 0) {
-            let isHuman = output[(i + 1) / 2 - 1] > BLACKGROUND_THEADHOLD;
+            let isHuman = output[(i + 1) / 4 - 1] > BLACKGROUND_THEADHOLD;
             isHumanArr[(i + 1) / 4] = isHuman;
             if (isHuman) {
               picArr[i - 1] = imageArr[i - 3];
@@ -258,33 +260,36 @@ export struct ImageGenerate {
         let infoBg = await pixelMapBg.getImageInfo();
         pixelMapBg.scaleSync(MODEL_INPUT_WIDTH / infoBg.size.width, MODEL_INPUT_HEIGHT / infoBg.size.height);
         let readBufferBg = new ArrayBuffer(MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 4);
-        pixelMapBg.readPixelsToBuffer(readBufferBg).then(() => {
-          let bgArr = new Uint8Array(readBufferBg.slice(0, MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 4));
-          let resArr = new Uint8Array(MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 4);
-          for (let i = 0; i < resArr.length; ++i) {
-            if ((i + 1) % 4 === 0) {
-              if (isHumanArr[(i + 1) / 4 - 1]) {
-                resArr[i - 1] = picArr[i - 1];
-                resArr[i - 2] = picArr[i - 2];
-                resArr[i - 3] = picArr[i - 3];
-                resArr[i] = 255;
-              } else {
-                resArr[i - 1] = bgArr[i - 3];
-                resArr[i - 2] = bgArr[i - 2];
-                resArr[i - 3] = bgArr[i - 1];
-                resArr[i] = 255;
-              }
+        pixelMapBg.readPixelsToBufferSync(readBufferBg);
+        let bgArr = new Uint8Array(readBufferBg.slice(0, MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 4));
+        let resArr = new Uint8Array(MODEL_INPUT_HEIGHT * MODEL_INPUT_WIDTH * 4);
+        for (let i = 0; i < resArr.length; ++i) {
+          if ((i + 1) % 4 === 0) {
+            if (isHumanArr[(i + 1) / 4 - 1]) {
+              resArr[i - 1] = picArr[i - 1];
+              resArr[i - 2] = picArr[i - 2];
+              resArr[i - 3] = picArr[i - 3];
+              resArr[i] = 255;
+            } else {
+              resArr[i - 1] = bgArr[i - 3];
+              resArr[i - 2] = bgArr[i - 2];
+              resArr[i - 3] = bgArr[i - 1];
+              resArr[i] = 255;
             }
           }
-          let opts: image.InitializationOptions = {
-            editable: true, pixelFormat: image.PixelMapFormat.RGBA_8888,
-            size: { height: MODEL_INPUT_HEIGHT, width: MODEL_INPUT_WIDTH }
-          };
-          // Gaussian filtering: This step is not mandatory
-          // gaussianRun(resArr)
-          pixelMapBg.release();
-          this.outMergePixMap = image.createPixelMapSync(resArr.buffer, opts);
-        })
+        }
+        let opts: image.InitializationOptions = {
+          editable: true, pixelFormat: image.PixelMapFormat.RGBA_8888,
+          size: { height: MODEL_INPUT_HEIGHT, width: MODEL_INPUT_WIDTH }
+        };
+        // Gaussian filtering: This step is not mandatory
+        // gaussianRun(resArr)
+        image.createPixelMap(resArr.buffer, opts)
+          .then(outMergeImage => {
+            this.outMergePixMap = outMergeImage;
+            this.canMergeImage = true;
+          })
+        pixelMapBg.release();
         imageSourceBg.release();
       })
     } catch (error) {
@@ -341,6 +346,14 @@ export struct ImageGenerate {
           ForEach(this.previewImageList, (item: BackgroundImageItem, index: number) => {
             PreviewImage({ imgItem: item })
               .onClick(() => {
+                if (!this.canMergeImage) {
+                  this.getUIContext().getPromptAction().showToast({
+                    message: "图像合成中!请勿切换背景",
+                    alignment: Alignment.Center,
+                  })
+                  return;
+                }
+
                 this.previewImageList[index].selected = true;
                 this.currentPreviewImageIndex = index;
                 this.handleMergeImage();
diff --git a/entry/src/main/ets/pages/Index.ets b/entry/src/main/ets/pages/Index.ets
index 5955f817ede21c23ecd35da7be7fffccc4cebae0..c193e3871b1823f53e6766874399711dd8304338 100644
--- a/entry/src/main/ets/pages/Index.ets
+++ b/entry/src/main/ets/pages/Index.ets
@@ -26,6 +26,18 @@ struct Index {
   build() {
     Navigation(this.pathStack) {
       RelativeContainer() {
+        Text($r("app.string.home_text_generate_tips"))
+          .id("id_home_tips")
+          .alignRules({
+            middle: { anchor: "__container__", align: HorizontalAlign.Center},
+            bottom: { anchor: "id_home_btn", align: VerticalAlign.Top }
+          })
+          .fontWeight(400)
+          .fontSize(12)
+          .lineHeight(16)
+          .fontColor("rgba(0,0,0,0.6)")
+          .margin({ bottom: 24})
+
         Button($r("app.string.home_btn_text"))
           .id("id_home_btn")
           .alignRules({
diff --git a/entry/src/main/ets/utils/Predict.ets b/entry/src/main/ets/utils/Predict.ets
index 3004e850ee976fc930d001da39235467db41931a..5f0c457a88171c909e9ea1067827b18cd47cd16e 100644
--- a/entry/src/main/ets/utils/Predict.ets
+++ b/entry/src/main/ets/utils/Predict.ets
@@ -20,7 +20,7 @@
 export default async function modelPredict (modelBuffer: ArrayBuffer, inputsBuffer: ArrayBuffer[]): Promise {
   // 1. Create Context
   let context: mindSporeLite.Context = {};
-  context. target = ['cpu'];
+  context.target = ['cpu'];
   context.cpu = {};
   context.cpu.threadNum = 2;
   context.cpu.threadAffinityMode = 1;
diff --git a/entry/src/main/resources/base/element/string.json b/entry/src/main/resources/base/element/string.json
index 962283f3b0f9216540c8c0bee443db6f25c34e50..d825eaf2ce78b2bbd79ecaefdbde1cf225e24d31 100644
--- a/entry/src/main/resources/base/element/string.json
+++ b/entry/src/main/resources/base/element/string.json
@@ -10,62 +10,66 @@
     },
     {
       "name": "EntryAbility_label",
-      "value": "图像分割合成"
+      "value": "Image Segmentation and Composition"
     },
     {
       "name": "home_btn_text",
-      "value": "图像合成"
+      "value": "Image compositing"
+    },
+    {
+      "name": "home_text_generate_tips",
+      "value": "It is recommended to use a 1:1 size image of a person to guarantee the best compositing"
     },
     {
       "name": "home_navbar_title",
-      "value": "图像分割"
+      "value": "Image Segmentation"
     },
     {
       "name": "image_page_navbar_title",
-      "value": "图像合成"
+      "value": "Image composition"
     },
     {
       "name": "segment_btn_text_first",
-      "value": "原图"
+      "value": "origin"
     },
     {
       "name":"segment_btn_text_second",
-      "value": "合成"
+      "value": "generate"
     },
     {
       "name": "preview_img_list_001",
-      "value": "默认"
+      "value": "default"
     },
     {
       "name": "preview_img_list_002",
-      "value": "蓝色"
+      "value": "blue"
     },
     {
       "name": "preview_img_list_003",
-      "value": "红色"
+      "value": "red"
     },
     {
       "name": "preview_img_list_004",
-      "value": "背景1"
+      "value": "bg1"
     },
     {
       "name": "preview_img_list_005",
-      "value": "背景2"
+      "value": "bg2"
     },
     {
       "name": "preview_img_list_006",
-      "value": "背景3"
+      "value": "bg3"
     },
     {
       "name": "preview_img_list_007",
-      "value": "背景4"
+      "value": "bg4"
     },
     {
       "name": "preview_img_list_008",
-      "value": "背景5"
+      "value": "bg5"
     },
     {
       "name": "preview_img_list_009",
-      "value": "背景6"
+      "value": "bg6"
     }]
 }
\ No newline at end of file
diff --git a/entry/src/main/resources/en_US/element/string.json b/entry/src/main/resources/en_US/element/string.json
index 837af6c3d1ace58d4a2637a63d6dc6c62f0220e1..6ac94280f482a66b65d4c21f49e427c890a3db9f 100644
--- a/entry/src/main/resources/en_US/element/string.json
+++ b/entry/src/main/resources/en_US/element/string.json
@@ -16,6 +16,10 @@
       "name": "home_btn_text",
       "value": "Image compositing"
     },
+    {
+      "name": "home_text_generate_tips",
+      "value": "Use a 1:1 person image for the best compositing."
+    },
     {
       "name": "home_navbar_title",
       "value": "Image Segmentation"
diff --git a/entry/src/main/resources/rawfile/model_float32.ms b/entry/src/main/resources/rawfile/model_float32.ms
deleted file mode 100644
index 46bab9d013e28b32d8bddbed5ced71ac11ad8761..0000000000000000000000000000000000000000
Binary files a/entry/src/main/resources/rawfile/model_float32.ms and /dev/null differ
diff --git a/entry/src/main/resources/rawfile/rmbg_fp16.ms b/entry/src/main/resources/rawfile/rmbg_fp16.ms
new file mode 100644
index 0000000000000000000000000000000000000000..3cda938927af408f5eeff78225e8065e5fda523c
Binary files /dev/null and b/entry/src/main/resources/rawfile/rmbg_fp16.ms differ
diff --git a/entry/src/main/resources/zh_CN/element/string.json b/entry/src/main/resources/zh_CN/element/string.json
index 962283f3b0f9216540c8c0bee443db6f25c34e50..77e2265f8b2e8ec30b2be42d5782c697e1633374 100644
--- a/entry/src/main/resources/zh_CN/element/string.json
+++ b/entry/src/main/resources/zh_CN/element/string.json
@@ -16,6 +16,10 @@
       "name": "home_btn_text",
       "value": "图像合成"
     },
+    {
+      "name": "home_text_generate_tips",
+      "value": "建议使用 1:1 尺寸的人物图像以保证最佳合成效果"
+    },
     {
       "name": "home_navbar_title",
       "value": "图像分割"
diff --git a/screenshots/Timing_digram.PNG b/screenshots/Timing_digram.PNG
index addd015d1dcaed7aad869380c6422a1ff272a585..b931a845e9a4280a35c8059f00a29d2a6fd4bc8e 100644
Binary files a/screenshots/Timing_digram.PNG and b/screenshots/Timing_digram.PNG differ
diff --git a/screenshots/device/screenshot_001.png b/screenshots/device/screenshot_001.png
index 135625698226a15a5838e6529551009c4380b463..f7db2dadeddc08975e149e05aff3e3d81ecdb0e3 100644
Binary files a/screenshots/device/screenshot_001.png and b/screenshots/device/screenshot_001.png differ
diff --git a/screenshots/device/screenshot_002.png b/screenshots/device/screenshot_002.png
index d07a0554e1d13b7a362322911548c1ed0b2b1b36..4bd1a13caf673c2bb04a3152adac7b95ef52c0e9 100644
Binary files a/screenshots/device/screenshot_002.png and b/screenshots/device/screenshot_002.png differ
diff --git a/screenshots/device/screenshot_003.png b/screenshots/device/screenshot_003.png
index aaf692ea06d20cc4d7cc7d94de0611da89483f6f..b98aacaf909de36e40cbea04f45f130e8573f1b4 100644
Binary files a/screenshots/device/screenshot_003.png and b/screenshots/device/screenshot_003.png differ
diff --git a/screenshots/device/screenshot_004.png b/screenshots/device/screenshot_004.png
index 2ce7a3396672efc576170486d314baa2b14a75c0..51b0c61c5cbbd6ab465d5a51d8ca553fe40a7202 100644
Binary files a/screenshots/device/screenshot_004.png and b/screenshots/device/screenshot_004.png differ
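
Review note on the `ImageGenerate.ets` changes: the mask lookup moves from `output[(i + 1) / 2 - 1]` to `output[(i + 1) / 4 - 1]`. For the alpha byte of pixel `p` (`i = 4p + 3`), the old expression evaluates to `2p + 1` and the new one to `p`, so the new form reads one mask value per RGBA pixel. The standalone TypeScript sketch below restates the preprocessing and compositing with an explicit pixel index. It is an illustration only: `normalizeRgba` and `compositeWithMask` are hypothetical names that do not exist in the repository, the formula `(v / 255 - mean) / std` is an assumption (the diff shows only the `means` and `stds` constants), and the channel reordering done in the original code is left out.

```typescript
// Hypothetical helpers for illustration; not part of the repository.
// The constants mirror the values introduced by this diff.
const MEANS: number[] = [0.5, 0.5, 0.5];
const STDS: number[] = [1.0, 1.0, 1.0];
const THRESHOLD = 0.88; // BLACKGROUND_THEADHOLD in ImageDataListConstant.ets

// Convert interleaved RGBA bytes into interleaved float RGB, dropping alpha.
// Assumption: bytes are scaled to [0, 1] before mean/std normalization.
function normalizeRgba(rgba: Uint8Array): Float32Array {
  const pixelCount = rgba.length / 4;
  const out = new Float32Array(pixelCount * 3);
  for (let p = 0; p < pixelCount; p++) {
    for (let c = 0; c < 3; c++) {
      out[p * 3 + c] = (rgba[p * 4 + c] / 255 - MEANS[c]) / STDS[c];
    }
  }
  return out;
}

// Pick foreground or background per pixel using one mask value per pixel.
// The same index p is used for the mask read and for the RGBA write.
function compositeWithMask(fg: Uint8Array, bg: Uint8Array, mask: Float32Array): Uint8Array {
  const pixelCount = mask.length;
  const out = new Uint8Array(pixelCount * 4);
  for (let p = 0; p < pixelCount; p++) {
    const src = mask[p] > THRESHOLD ? fg : bg;
    const base = p * 4;
    out[base] = src[base];
    out[base + 1] = src[base + 1];
    out[base + 2] = src[base + 2];
    out[base + 3] = 255; // force an opaque result, as the diff does
  }
  return out;
}
```

Iterating by pixel index rather than by byte index keeps the mask stride and the RGBA stride visibly separate, which is the relationship the `/ 2` to `/ 4` change corrects.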