Skip to content

Conversation

sakshimisra
Copy link
Contributor

@sakshimisra sakshimisra commented Oct 11, 2020

Talktorial ID: 021

  • Title: The One-Hot Encoding concept
  • Original authors: Sakshi Misra
  • Reviewer(s): Talia B. Kimber, Prof. Dr. Andrea Volkamer, Yonghui Chen
  • Date of review: 10/2020 - present

Content review

  • Potential labels or categories : One Hot Encoding, SMILES, sklearn, Keras
  • One line summary: Perform one-hot encoding on SMILES structures by implementing own function, by using implementations in sklearn and keras and then compare their performance in terms of time each implementation takes to execute.
  • The table of contents reflects the talktorial story-line; order of #, ##, ### headers is correct
  • URLs are linked with meaningful words, instead of pasting the URL directly or linking words like here.
  • I have spell-checked the notebook
  • Images have enough resolution to be rendered with quality, without being too heavy.
  • All figures have a description
  • Markdown cell content is still in-line with code cell output (whenever results are discussed)
  • I have checked that cell outputs are not incredibly long (this applies also to DataFrames)
  • Formatting looks correctly on the Sphinx render (bold, italics, figure placing)

Code review

  • Time it took to execute (approx.):
  • Variable and function names follow snake case rules (e.g. a_variable_name vs aVariableName)
  • Spacing follows PEP8 (run Black on the code cells if needed)
  • Code line are under 99 characters each (run black -l 99)
  • Comments are useful and well placed
  • There are no unpythonic idioms like for i in range(len(list)) (see slides)
  • All 3rd party dependencies are listed at the top of the notebook
  • I have marked all code cell with output referenced in markdown cells with the label # NBVAL_CHECK_OUTPUT
  • I have identified potential candidates for a code refactor / useful functions
  • All import ... lines are at the top (practice part) cell, ordered by standard library / 3rd party packages / our own (teachopencadd.*)
  • I have update the relative paths to absolute paths.
  • List here unfamiliar libraries you find in the imports and their intended use:

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@sakshimisra sakshimisra changed the base branch from master to reviews October 11, 2020 11:06
Base automatically changed from reviews to packaging October 22, 2020 12:13
@sakshimisra sakshimisra requested a review from t-kimber October 25, 2020 16:07
@sakshimisra sakshimisra self-assigned this Oct 25, 2020
@AndreaVolkamer
Copy link
Member

@sakshimisra well done (thanks for your help @t-kimber)!
There are only one or two small things that we'll discuss offline. @jaimergp could you go ahead with merging, as soon as one of us gives you the final go! Thanks everyone.

@jaimergp
Copy link
Contributor

jaimergp commented Jan 18, 2021

How come there are other notebooks affected? Ah, this was opened way before the big merge. I'll need some time to go through the conflicts, but I'll have it done this week.

-- edit -- as per my conversation with Andrea, we will wait until I am pinged again to solve the conflicts and merge. Thanks!

@AndreaVolkamer
Copy link
Member

@sakshimisra thanks for the update, well done.

I reviewed the shortened version of the notebook and checked/adapted mainly these points

  • consistent usage of one-hot encoding (vs. one hot)
  • SMILES in text vs smiles in code
  • text and parameters in doc strings, also including/adapting parameters making sure that no global variables are used
  • small typos and textual adaptions
  • in replacement: excluded Cn since it is rather carbon (C) + aromatic nitrogen (n)
  • excluded compound from cvs file that only contained [Cl-]

Final Todos, please:

  • Check the website rendering
  • Delete temporary notebook
  • While blog posts are very illustrative and easy to understand, please include a more sophisticated reference for OHE if possible
  • Maybe we can adapt the discussion slightly to comment more on our field (CADD), how OHE can be used, maybe also potential shortcomings of the SMILES preprocessing, not sure if the NLP comment is completely clear.

Thanks.

@t-kimber t-kimber changed the base branch from master to t011-base April 26, 2021 13:24
@jaimergp
Copy link
Contributor

Thanks everybody, especially @sakshimisra of course! Awesome work ❤️

I cleaned the history a bit as per #99 and will close this as a result.

@jaimergp jaimergp closed this Apr 26, 2021
@jaimergp jaimergp deleted the sm-019-review branch April 26, 2021 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

new-talktorial New talktorial

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants