@dimapihtar dimapihtar commented Jul 10, 2025

Important

The Update branch button must only be pressed on very rare occasions.
An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do?

Removes nlp/language_modeling.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?
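
As an illustration of the import-guard check above, here is a minimal sketch of the pattern. It assumes a hypothetical optional package (`some_optional_lib_that_may_be_missing`) and hypothetical helpers `process` and `require_optional`; none of these are NeMo APIs. The flag only mirrors the `HAVE_NLP` style that appears in this PR's diff.

```python
# Import guard: the module stays importable even when the optional
# dependency is not installed. The package name is a made-up placeholder.
try:
    import some_optional_lib_that_may_be_missing as optional_lib

    HAVE_OPTIONAL = True
except ImportError:
    optional_lib = None
    HAVE_OPTIONAL = False


def process(values):
    """Use the optional library when present, otherwise fall back to pure Python."""
    if HAVE_OPTIONAL:
        return optional_lib.process(values)  # hypothetical call
    return [v * 2 for v in values]


def require_optional():
    """Fail with a clear message at call time rather than at import time."""
    if not HAVE_OPTIONAL:
        raise ImportError(
            "some_optional_lib_that_may_be_missing is required for this feature "
            "but is not installed."
        )
```

With a guard like this, code paths that do not need the optional library keep working, and code paths that do need it can call `require_optional()` first to raise a readable error.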

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: dimapihtar <[email protected]>
@github-actions github-actions bot added the NLP label Jul 10, 2025
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
@github-actions github-actions bot added the TTS label Jul 15, 2025
@dimapihtar dimapihtar marked this pull request as ready for review July 15, 2025 14:11

Check notice

Code scanning / CodeQL

Cyclic import Note

Import of module
nemo.collections.nlp.modules.common.retro_inference_strategies
begins an import cycle.
dimapihtar and others added 6 commits July 15, 2025 07:42
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Comment on lines +29 to +32
from nemo.collections.nlp.data.language_modeling.megatron.gpt_sft_chat_dataset import (
    _get_header_conversation_type_mask_role,
    get_prompt_template_example,
)

Check notice

Code scanning / CodeQL

Unused import Note

Import of 'get_prompt_template_example' is not used.

Copilot Autofix

AI 6 months ago

To fix the problem:

  1. Remove the unused import get_prompt_template_example from the nemo.collections.nlp.data.language_modeling.megatron.gpt_sft_chat_dataset module.
  2. Ensure that the removal does not affect the functionality of the code, as no references to get_prompt_template_example exist in the file.

Detailed steps:

  • Locate the import statement starting on line 29.
  • Remove the specific get_prompt_template_example from the import list while keeping any other imports intact (_get_header_conversation_type_mask_role).
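
For readers who want to double-check a notice like this locally, the following is a rough sketch of an unused-import check built on Python's standard `ast` module. It is a simplified heuristic, not how CodeQL performs its analysis: it ignores `__all__`, re-exports, and names referenced only in strings. The function `find_unused_imports` and the sample names (`pkg.mod`, `helper_a`, `helper_b`) are hypothetical.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced elsewhere in `source`."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports cannot be checked this way
                imported.add(alias.asname or alias.name)

    # Every bare identifier that appears in the module body.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Hypothetical module text; it is only parsed, never imported or executed.
SAMPLE = """
from pkg.mod import (
    helper_a,
    helper_b,
)

print(helper_a)
"""

print(find_unused_imports(SAMPLE))
```

Because the source is parsed rather than executed, the check can run on a file whose imports are not installed in the current environment.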

Suggested changeset 1
nemo/collections/nlp/modules/common/text_generation_server.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/nemo/collections/nlp/modules/common/text_generation_server.py b/nemo/collections/nlp/modules/common/text_generation_server.py
--- a/nemo/collections/nlp/modules/common/text_generation_server.py
+++ b/nemo/collections/nlp/modules/common/text_generation_server.py
@@ -28,7 +28,6 @@
 try:
     from nemo.collections.nlp.data.language_modeling.megatron.gpt_sft_chat_dataset import (
         _get_header_conversation_type_mask_role,
-        get_prompt_template_example,
     )
 
     HAVE_NLP = True
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
Signed-off-by: dimapihtar <[email protected]>
dimapihtar and others added 2 commits July 15, 2025 16:52
Signed-off-by: dimapihtar <[email protected]>
@chtruong814 chtruong814 left a comment

@dimapihtar this is all nemo1 code we're removing?

@chtruong814 (Collaborator)

@dimapihtar I think that last test is failing because the path for helpers.cpp file was renamed.

Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
@dimapihtar (Collaborator, Author)

> @dimapihtar this is all nemo1 code we're removing?

No, nemo.nlp.modules is still left; it will be removed in a separate follow-up PR. Removing everything in a single PR is just too complicated.

@dimapihtar (Collaborator, Author)

> @dimapihtar I think that last test is failing because the path for helpers.cpp file was renamed.

It was failing because I forgot to move the Makefile along with helpers.cpp.

@github-actions (Contributor)

[🤖]: Hi @dimapihtar 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

//cc @chtruong814 @ko3n1g @pablo-garay @thomasdhc

@dimapihtar dimapihtar merged commit f2ee5e1 into main Oct 14, 2025
616 of 619 checks passed
@dimapihtar dimapihtar deleted the dpykhtar/remove_language_modelling branch October 14, 2025 16:32
@chtruong814 chtruong814 added the r2.5.0 Cherry-pick label for the 2.5.0 release label Oct 14, 2025
dimapihtar added a commit that referenced this pull request Oct 15, 2025
* remove language_modeling

Signed-off-by: dimapihtar <[email protected]>

* fix imports

Signed-off-by: dimapihtar <[email protected]>

* fix imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix import

Signed-off-by: dimapihtar <[email protected]>

* fix imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* remove multimodal data unit tests

Signed-off-by: dimapihtar <[email protected]>

* remove nlp

Signed-off-by: dimapihtar <[email protected]>

* remove import checks for nlp

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* remove check imports for nlp

Signed-off-by: dimapihtar <[email protected]>

* remove check imports for nlp

Signed-off-by: dimapihtar <[email protected]>

* fix nlp imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* guard imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* move MegatronPretrainingBatchSampler

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix import

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* fix import

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix import

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* remove multimodal test

Signed-off-by: dimapihtar <[email protected]>

* fix import

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* remove files

Signed-off-by: dimapihtar <[email protected]>

* resolve merge conflicts

Signed-off-by: dimapihtar <[email protected]>

* revert changes

Signed-off-by: dimapihtar <[email protected]>

* fix nlp data imports

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* move list_available_models to nlp.modules

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* get rid of language modelling modules

Signed-off-by: dimapihtar <[email protected]>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <[email protected]>

* fix style

Signed-off-by: dimapihtar <[email protected]>

* add Makefile

Signed-off-by: dimapihtar <[email protected]>

* add Makefile

Signed-off-by: dimapihtar <[email protected]>

---------

Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Co-authored-by: dimapihtar <[email protected]>
chtruong814 pushed a commit that referenced this pull request Oct 15, 2025
Labels

CI, common, Multi Modal, NLP, r2.5.0 (Cherry-pick label for the 2.5.0 release), TTS
