SMILES Pair Encoding (JCIM) first learns a vocabulary of high-frequency SMILES substrings from a large chemical dataset (e.g., ChEMBL) and then tokenizes SMILES based on the learned vocabulary for ...
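
The learn-then-tokenize idea in the snippet above can be illustrated with a toy byte-pair-style merge loop. This is a minimal sketch, not the SMILES Pair Encoding implementation or its API: the function names `learn_merges`, `apply_merge`, and `tokenize` are hypothetical, the three-string corpus is made up for illustration, and starting from single characters is a simplification assumed here.

```python
from collections import Counter


def apply_merge(seq, pair):
    """Replace every adjacent occurrence of `pair` in `seq` with one joined token."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_merges(corpus, num_merges):
    """Learn up to `num_merges` merges of the most frequent adjacent token pair."""
    # Assumption: start from single characters to keep the toy example short.
    sequences = [list(s) for s in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in sequences:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Break frequency ties deterministically by the pair itself.
        best, count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < 2:
            break  # no pair repeats, so nothing is worth merging
        merges.append(best)
        sequences = [apply_merge(seq, best) for seq in sequences]
    return merges


def tokenize(smiles, merges):
    """Tokenize one string by replaying the learned merges in order."""
    seq = list(smiles)
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


corpus = ["CCO", "CCN", "CCC"]  # made-up toy corpus
merges = learn_merges(corpus, num_merges=10)
tokens = tokenize("CCO", merges)
```

Concatenating the tokens always reproduces the input string, so the tokenization loses no information; only the granularity of the tokens changes.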
CodeSim is a research toolkit that implements and benchmarks 23 different unsupervised similarity measures for detecting code clones in Java source code. This work addresses the critical challenge of ...