[NPU] adaptation for LLaMA #7262
base: develop
Conversation
Export the NPU model
Thanks for your contribution!
yuanwei66 does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Already signed the CLA but the status is still pending? Let us recheck it.
fix accuracy bugs
Adjust the model directory and enable the benchmark branch
Adapt the static multi-batch attention mask
Adjust non-benchmark inputs
Optimize the maximum batch count; change the incremental attention mask to a vector
add hccl_buffsize control
Optimize memory by avoiding large-batch attention mask operations
Optimize device memory: avoid frequent operations on the attention mask tensor, which cause memory fragmentation; use accelerator-library operators to avoid slow NumPy computation
Enable the weight transpose feature; accuracy verified OK, works together with PR MyAngelAyase/PaddleCustomDevice#66
export NZ
Integrate Pad/UnPad adaptation (the model needs to be re-exported)
Update modeling.py for embedding
Update for the master branch, and speed up position embedding
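Several of the commits above revolve around one idea: keep the decode-phase (incremental) attention mask as a preallocated vector that is updated in place, instead of rebuilding a full mask tensor with NumPy at every generation step, which fragments device memory. A minimal sketch of that idea in plain NumPy (all class and method names here are illustrative, not the PR's actual code):

```python
import numpy as np

class IncrementalMask:
    """Decode-phase attention mask kept as one preallocated 1-D vector.

    Rebuilding a full mask tensor each step allocates fresh memory and
    fragments the device heap; flipping one slot per generated token
    reuses the same buffer for the whole generation loop.
    """

    def __init__(self, max_len: int, pad_value: float = -1e4):
        # Start fully masked; positions are opened as tokens are generated.
        self.mask = np.full((max_len,), pad_value, dtype=np.float32)
        self.cur = 0

    def step(self) -> np.ndarray:
        # Unmask the next position in place -- no new allocation.
        self.mask[self.cur] = 0.0
        self.cur += 1
        # Return a view (not a copy) over the valid prefix.
        return self.mask[: self.cur]

# Usage: one in-place update per decode step.
m = IncrementalMask(max_len=8)
first = m.step()   # valid prefix of length 1
second = m.step()  # valid prefix of length 2
```

The broadcasting of this vector to a per-batch mask would then be left to a fused accelerator-library operator rather than host-side NumPy, which is the second half of the optimization the commits describe.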
This Pull Request is stale because it has been open for 60 days with no activity.
PR types
PR changes
Description