Data Creation for Machine Learning

Topics related to Spinach package
schmidnicolas
Posts: 2
Joined: Thu Mar 02, 2023 10:26 am

Post by schmidnicolas »

Dear Spinach experts,
I want to simulate several simple 1D and 2D spectra (e.g. 1H, 13C, HSQC, and HMBC) for a large number of small molecules (>10k) to train a machine learning model. The more information I have about the spin systems and the corresponding molecules to use as potential labels, the better.

Would it be possible to generate this data with Spinach in an automated fashion, or would you suggest other tools or databases for this?
If yes, what would the procedure look like?
Are there any databases of small molecules from which I could import the information needed to simulate spectra for this many molecules?
What about ambiguity? What measurements/spectra of small molecules are needed to identify the spin system or even molecule for the most "common cases" with high confidence?

Thanks in advance for your answer.

Best,

Nicolas
kuprov
Posts: 125
Joined: Mon Mar 29, 2021 4:26 pm

Re: Data Creation for Machine Learning

Post by kuprov »

This should be no problem in Spinach: take one of the existing large HSQC examples, get it to read your spin system specifications instead of the sucrose it currently uses, and get it to write the results out in whatever format you like.
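As a rough illustration of the "read your specifications" step, here is a minimal Python sketch that writes one JSON file per molecule for a modified example script to read. Everything in it is an assumption made for illustration: the function names `make_spec` and `write_specs`, the JSON field names, the file naming scheme, and the placeholder shift and coupling values are hypothetical and are not part of Spinach. It only prepares input files; the simulation itself would still be done by the adapted Spinach example.

```python
import json
import tempfile
from pathlib import Path


def make_spec(name, isotopes, shifts_ppm, j_couplings_hz):
    """Build one spin-system specification as a plain dictionary.

    j_couplings_hz maps a pair of 0-based spin indices (i, j) to a
    scalar coupling in Hz. The field names are hypothetical.
    """
    if len(isotopes) != len(shifts_ppm):
        raise ValueError("need exactly one chemical shift per spin")
    couplings = []
    for (i, j), value in sorted(j_couplings_hz.items()):
        in_range = 0 <= i < len(isotopes) and 0 <= j < len(isotopes)
        if i == j or not in_range:
            raise ValueError(f"invalid spin pair ({i}, {j})")
        couplings.append({"spins": [i, j], "J_Hz": value})
    return {
        "name": name,
        "isotopes": list(isotopes),
        "shifts_ppm": list(shifts_ppm),
        "couplings": couplings,
    }


def write_specs(specs, out_dir):
    """Write each specification to its own numbered JSON file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, spec in enumerate(specs):
        path = out_dir / f"mol_{index:05d}.json"
        path.write_text(json.dumps(spec, indent=2))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Placeholder values for a single 1H-13C pair, not measured data.
    demo = make_spec("demo_CH", ["1H", "13C"], [4.0, 60.0], {(0, 1): 140.0})
    with tempfile.TemporaryDirectory() as tmp:
        for written in write_specs([demo], tmp):
            print(written.name)
```

On the MATLAB side, a file like this could be loaded with `jsondecode(fileread(...))` and mapped onto the spin system definition of the example script. Because the same file holds the shifts and couplings, it can also serve as the label record for the machine learning model.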

Re: Data Creation for Machine Learning

Post by schmidnicolas »

Thanks for the fast answer. I'll give it a shot.