Conversation

@PaulaKramer (Collaborator) commented Dec 6, 2022

Details

  • Talktorial ID: 034
  • Title: [DL Edition] T034: GNN based molecular property prediction
  • Original authors: Paula Kramer
  • Reviewer(s): XXX
  • Date of review: DD-MM-YYYY

Content

  • One line summary: Introduction to Graph Neural Networks for Property Prediction
  • Potential labels or categories (e.g. machine learning, small molecules, online APIs): Machine learning, small molecules, graph neural networks
  • Time it took to execute (approx.): 7 min
  • I have used the talktorial template and followed the content and formatting suggestions there
  • Packages must be open source and should be installable from conda-forge. If you are adding new packages to the TeachOpenCADD environment, please check whether already installed packages can provide the same functionality and, if not, leave a sentence explaining why the new addition is needed. If a new package is not on conda-forge, please list it and its intended usage here.
    • numpy, matplotlib: Already in TeachOpenCADD
    • pytorch 1.12.1, pytorch-cluster 1.6.0, pytorch-scatter 2.1.0, pytorch-sparse 0.6.15, pyg 2.2.0 (conda-forge): I use these for implementing graph neural networks
  • Data must be publicly available, preferably accessible via a webserver or downloadable via a URL. Please list the data resources that you use and how to access them:

Content style

  • Talktorial includes cross-references to other talktorials if applicable
  • The table of contents reflects the talktorial story-line; order of #, ##, ### headers is correct
  • URLs are linked with meaningful words, instead of pasting the URL directly or linking generic words like "here".
  • I have spell-checked the notebook
  • Images have enough resolution to be rendered with quality, without being too heavy.
  • All figures have a description
  • Markdown cell content is still in line with code cell output (whenever results are discussed)
  • I have checked that cell outputs are not incredibly long (this applies also to DataFrames)
  • Formatting looks correctly on the Sphinx render (bold, italics, figure placing)

Code style

  • Variable and function names follow snake case rules (e.g. a_variable_name vs aVariableName)
  • Spacing follows PEP8 (run Black on the code cells if needed)
  • Code lines are under 99 characters each (run black-nb -l 99)
  • Comments are useful and well placed
  • There are no unpythonic idioms like for i in range(len(list)) (see slides and the short example after this checklist)
  • All 3rd party dependencies are listed at the top of the notebook
  • I have marked all code cells whose output is referenced in markdown cells with the label # NBVAL_CHECK_OUTPUT
  • I have identified potential candidates for a code refactor / useful functions
  • All import ... lines are at the top of the (practice part) cell, ordered by standard library / 3rd party packages / our own (teachopencadd.*)
  • I have used absolute paths instead of relative paths
    from pathlib import Path

    HERE = Path(_dh[-1])  # _dh[-1] is the notebook directory (provided by IPython)
    DATA = HERE / "data"
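
For the unpythonic-idioms point, a toy example (not from the notebook) of the pattern to avoid and its pythonic replacement:

    scores = [0.91, 0.84, 0.77]

    # Unpythonic: indexing via range(len(...))
    for i in range(len(scores)):
        print(i, scores[i])

    # Pythonic: iterate directly, or use enumerate when the index is needed
    for i, score in enumerate(scores):
        print(i, score)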

Website

We present our talktorials on the TeachOpenCADD website (https://projects.volkamerlab.org/teachopencadd/), so we also have to check whether the Jupyter notebook renders nicely there.

  • If this PR adds a new talktorial, please follow these steps:
    • Add your talktorial to the complete list of talktorials here (at the end).
    • Add your talktorial to one or multiple of the collections here. Or propose a new collection section in your PR.
    • Add your talktorial's nblink file by running python generate_nblinks.py from within the directory teachopencadd/docs/talktorials.
    • Please compile the website following the instructions here.
  • Check the rendering of the talktorial of this PR.
  • Is your talktorial listed in the talktorial list?
  • Is your talktorial listed in the talktorial collections?
    • Add a picture for your talktorial in the collection view by following these instructions.

@AndreaVolkamer changed the title from "Start branch" to "[DL Edition] T034: GNN based molecular property prediction" on Dec 8, 2022
@AndreaVolkamer added the new-talktorial label on Dec 8, 2022

@gerritgr (Collaborator) commented Jan 30, 2023

  • GNNs should first be defined as differentiable, trainable, permutation-equi(/in)variant functions. The architectures should then be introduced as specific instances of such functions.
  • The relationship between message passing as a general and powerful framework and GCN/GIN as (less powerful) instances could be clarified, as could the relationship between the aggregation/pooling function (whose input is a set) and permutation invariance.
  • Refer to T033 in the introduction.
  • d is overloaded: it denotes both the node degree and the feature dimension.
  • The advantages of a GNN library should be stated (sparse matrices, graph batching); also mention www.dgl.ai.
  • Better properties? ChatGPT suggests electronegativity, ionization potential, and bond angles and distances; no idea if they make sense, though.
  • Say explicitly that the pooling layer is invariant to the order of the input (the same as the aggregation function); see the sketch below.
  • "True vs predicted value" -> I would say "Ground truth vs prediction".
  • Can you give an intuition on what makes GIN more powerful?
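
To illustrate the last few points, here is a minimal PyTorch Geometric sketch (not taken from the talktorial; the toy graph, layer sizes, and variable names are made up). It builds a GCN and a GIN layer as concrete message-passing instances and checks that the sum-pooling readout is invariant to the node ordering:

    import torch
    from torch.nn import Linear, ReLU, Sequential
    from torch_geometric.nn import GCNConv, GINConv, global_add_pool

    num_node_features, hidden_dim = 4, 8

    # GCN: message passing with a fixed, degree-normalized neighborhood aggregation
    gcn = GCNConv(num_node_features, hidden_dim)

    # GIN: sum aggregation followed by a learnable MLP, which is what makes it as
    # expressive as the 1-WL isomorphism test
    gin = GINConv(
        Sequential(
            Linear(num_node_features, hidden_dim),
            ReLU(),
            Linear(hidden_dim, hidden_dim),
        )
    )

    # Toy graph: 3 nodes, undirected edges 0-1 and 1-2, all belonging to graph 0
    x = torch.randn(3, num_node_features)
    edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
    batch = torch.zeros(3, dtype=torch.long)

    h_gcn = gcn(x, edge_index)             # same interface, less expressive aggregation
    h = gin(x, edge_index)                 # permutation-equivariant node embeddings
    graph_emb = global_add_pool(h, batch)  # permutation-invariant graph readout

    # Relabel the nodes and pool again: the graph-level embedding does not change
    perm = torch.tensor([2, 0, 1])
    inv_perm = torch.argsort(perm)         # maps old node labels to new ones
    h_perm = gin(x[perm], inv_perm[edge_index])
    assert torch.allclose(global_add_pool(h_perm, batch), graph_emb, atol=1e-5)

The same argument applies to mean or max pooling: the readout only sees the multiset of node embeddings, so it cannot depend on their order.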

@gerritgr merged commit f7542d6 into DL_edition on Apr 11, 2023
@mbackenkoehler deleted the pk-034-gnns branch on January 29, 2024