Enhancing Setgen Efficiency in Rice Research
Explore two use cases for Setgen management at IRRI, covering parental management and seed data quality improvements. Learn how to streamline workflows and ensure data accuracy.
Presentation Transcript
Splitting Setgen into two use cases?
Ruaraidh Sackville Hamilton
International Rice Research Institute
Los Baños, Philippines
Use cases
• Parents managed by user: entry point = parents; Setgen good
  • Making crosses
  • Making selections
  • Bulking up seed
• Parents managed by others: entry point = offspring; data quality problems with Setgen
  • Incoming seeds received from others
  • (Entering historical data)
ICIS developers' workshop
Case 1. User manages parents
Step 1: Before trial, create list of existing parental GIDs to be included
Step 2: After trial, create list of new progeny GIDs to be created
Step 3: Create data for progeny
Use case 1 - sub-cases: Making selections vs bulking seed
Number of progeny GIDs per parent:
• Making selections
  • 0, 1 … N = number of offspring liked by the breeder
  • DER
• Bulking seed
  • 0 = failure of seed increase
  • 1 = normal successful seed increase
  • Usually MAN
  • (N>1: for the special case of splitting mixed accessions into uniform components)
Selection and seed increase: Features of GERMPLSM
Selection and seed increase: Features of NAMES
• Typically one name per progeny GID
  • = Preferred name, NSTAT=1
  • Also functions as preferred ID (≡ NSTAT=8)
• NVAL assigned automatically as f(parental name)
• User-specific rules for assigning NVAL:
  • Selection by IRRI breeder: = NVAL of preferred name of GPID2 & “-N”
  • Seed increase by IRRI GRC: = NVAL of preferred ID of MGID & “:YYYYSS”
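The two NVAL rules above can be sketched as simple string functions. This is a minimal illustration, not Setgen's actual code: the function names are hypothetical, the parent names are made up, and the format of the two-character season code "SS" is an assumption (the slide gives only the pattern ":YYYYSS").

```python
def selection_name(parent_preferred_name: str, n: int) -> str:
    """Breeder's selection: NVAL of the preferred name of GPID2, then "-N"."""
    if n < 1:
        raise ValueError("selection number must be 1 or greater")
    return f"{parent_preferred_name}-{n}"


def seed_increase_name(mgid_preferred_id: str, year: int, season: str) -> str:
    """Seed increase: NVAL of the preferred ID of MGID, then ":YYYYSS"."""
    # Assumption: the season code SS is exactly two characters.
    if len(season) != 2:
        raise ValueError("season code is assumed to be two characters")
    return f"{mgid_preferred_id}:{year:04d}{season}"


# Hypothetical parent names, used only to show the shape of the output.
print(selection_name("LINE A", 3))                # LINE A-3
print(seed_increase_name("ACC 123", 2009, "DS"))  # ACC 123:2009DS
```

Keeping each rule in its own function is one way to support the user-defined customisation the slides call for: a different institute could swap in its own rule without touching the rest of the workflow.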
Selection and seed increase: Features of NAMES
Use Case 1 summary
• Sub-cases for selection and seed multiplication very similar
  • One Setgen suitable
  • User-defined customisation to handle the differences
  • User-defined customisation for ease of use
• Setgen cf. GRIMS
  • Setgen = just the workflow for parent-offspring GIDs
  • GRIMS = whole workflow for selecting, growing, processing the harvest, storing the harvest
  • Setgen is just one element of the workflow controlled by GRIMS
  • Should Setgen be extended to handle the whole workflow?
Case 2. User receives seed from others
1: Initial data on batch: LISTNMS, EVENTMEM
   Need fast routine entry of data, without need for expert judgements, for quick release by SHU
2: Initial data on new GIDs as orphans
3: Upload to central (for external receipts processed by SHU)
4: Search central for existing GIDs representing the parents
5: Update data for new GIDs with parents already in central
6: Create GIDs for parents not already in central
7: Update data for the new GIDs from those parents
8: Scan / file / deposit original documents: FILELINK
Case 2 step 1: batch data
• LISTNMS
• EVENTMEM links to PERSONS, INSTITUT
  • Batch ID, batch description, date received, donor person, donor institute
  • IP conditions e.g. SMTA, SMTA with additional restrictions, other restrictions
• FILELINK
  • To point to original documentation: e-files; scanned paper documents
2: Initial data entry for new GIDs: GERMPLSM
2: Initial data entry for new GIDs: NAMES
• Germplasm provider may provide:
  • ± pedigree info
  • 0, 1 … N names
• Choose name values to enter as
  • ENTRYCD, SOURCE, DESIG, GRPNAME
• Enter in LISTDATA
• Create NAMES records
  • Preferred ID (if specified by user’s rules)
    • Automatically assigned NVAL by user’s rules
    • NSTAT=8, NLOCN=GLOCN, NDATE=today, NTYPE=user-specified
  • Names provided by provider
    • With missing NSTAT, NLOCN, NDATE, NTYPE
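As a rough sketch of the NAMES records described above: one optional preferred-ID record with its fields filled in, plus one record per provider name with the status fields left missing. The `NameRecord` class and `build_name_records` function are hypothetical stand-ins, not the ICIS schema or API, and NDATE is held as a Python `date` purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NameRecord:
    """Illustrative subset of a NAMES row; None means 'missing'."""
    gid: int
    nval: str
    nstat: Optional[int] = None
    nlocn: Optional[int] = None
    ndate: Optional[date] = None
    ntype: Optional[int] = None


def build_name_records(
    gid: int,
    glocn: int,
    provider_names: List[str],
    preferred_id: Optional[str] = None,
    preferred_ntype: Optional[int] = None,
    today: Optional[date] = None,
) -> List[NameRecord]:
    records: List[NameRecord] = []
    if preferred_id is not None:
        # Preferred ID: NSTAT=8, NLOCN=GLOCN, NDATE=today, NTYPE=user-specified.
        records.append(
            NameRecord(gid, preferred_id, nstat=8, nlocn=glocn,
                       ndate=today or date.today(), ntype=preferred_ntype)
        )
    for nval in provider_names:
        # Provider's names: NSTAT, NLOCN, NDATE and NTYPE stay missing for now.
        records.append(NameRecord(gid, nval))
    return records
```

Leaving the provider's names incomplete matches the stated need for fast routine entry without expert judgement; the missing fields are filled in once the parents have been found.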
4: Searching central for GPID2
• Does central already have a GID representing the provider’s sample?
• Issue:
  • Many GIDs may share the same name
  • Nothing to indicate what each GID represents
    • New field GREPRESENTS??
    • Guidance from GLOCN, NLOCN, NTYPE, NSTAT, and same fields of candidate’s GPID2 & GPID1
    • Not easily seen in GMS_Search
  • Many errors in GLOCN, NLOCN, NTYPE, NSTAT
    • IRTP 456: 15 GIDs, 48 errors, 9 missing GIDs
    • Azucena: 74 GIDs, 27 missing, 10 unidentifiable, > 60% of GPID1-GPID2 values wrong
    • Inconsistent / inadequate user understanding
    • Inadequate data validation
GRepresents values proposed in 2009
• Good candidates for GPID2
  • Accession conserved in genebank at GLocN
  • Breeder's selection or other line produced at GLocN
  • Sample maintained at GLocN for testing in nurseries
  • Copy of a genebank accession or breeder's line held informally at GLocN
• Possible candidates
  • Notional GID required for historical pedigree
  • Inconsistent data
  • Unvalidated
• Not possible as candidates
  • Cross made at GLocN
  • Sample collected from field or market at GLocN
    • Except for new direct accession from field
4: Searching central for GPID2
• Perfect match:
  • Provider specifies own preferred ID
    • Provider uses same ICIS central
      • Gives GID of own sample
    • Sample from provider’s curated collection
      • Gives preferred name & ID as separate identifiers
    • Sample bred by provider
      • Line name is only name, serving as ID and name
  • Candidate GID has
    • (GID represents sample managed by donor)
    • GNPGS < 0
    • GLOCN = donor’s locid
    • Name with matching NVAL and:
      • Name with NLOCN=GLOCN
      • Only one name, or name with NSTAT=8
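The candidate-side conditions for a perfect match can be read as a single predicate. The sketch below is one reading of the slide, under two assumptions: that the matching name must itself carry NLOCN=GLOCN, and that it must either be the candidate's only name or have NSTAT=8. The `Name` and `Candidate` classes are hypothetical simplifications of NAMES and GERMPLSM rows.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Name:
    nval: str
    nlocn: Optional[int] = None
    nstat: Optional[int] = None


@dataclass
class Candidate:
    gid: int
    gnpgs: int
    glocn: int
    names: List[Name] = field(default_factory=list)


def is_perfect_match(cand: Candidate, donor_locid: int, provider_id: str) -> bool:
    # GNPGS < 0 and GLOCN = donor's locid are required before names are checked.
    if cand.gnpgs >= 0 or cand.glocn != donor_locid:
        return False
    for name in cand.names:
        # The matching name must have been assigned at the germplasm's own location.
        if name.nval == provider_id and name.nlocn == cand.glocn:
            # It must be the only name, or be flagged as preferred ID (NSTAT=8).
            if len(cand.names) == 1 or name.nstat == 8:
                return True
    return False
```

The "GID represents sample managed by donor" condition is left out because the slides present GREPRESENTS only as a proposed field.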
4: Searching central for GPID2
• Super perfect match = perfect match plus
  • Provider specifies their donor’s preferred ID
    • Candidate GID has GPID2 with single name or with preferred ID matching provider’s donor’s preferred ID
  • Provider specifies the original collected sample ID
    • Candidate GID has GPID1 with preferred ID having NTYPE=9 and NVAL=collected sample ID
  • Provider specifies the pedigree
    • Candidate GID has the same pedigree
4: Searching central for GPID2
• Imperfect match:
  • Provider does not specify own preferred ID
    • Not professional germplasm manager
    • E.g. provides only cultivar name or pedigree
  • Partial match; “matching” name (allowing variants):
    • “GID represents” not specified
    • GLOCN ≠ donor’s locid
    • NLOCN ≠ donor’s locid
    • Multiple names, none with NSTAT=8
    • Provider = genebank, gives accession ID, but no NID with NTYPE=1, NSTAT=8
• Data reliability
  • Multiple NIDs with same NVAL but inconsistent NLOCN, NDATE, NSTAT, NTYPE, GPID1, GPID2: potentially unreliable
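The data-reliability check above amounts to grouping name records by NVAL and flagging any NVAL that appears with more than one combination of the listed fields. A minimal sketch follows, assuming each record is a plain dictionary with lower-case keys; that layout is hypothetical, chosen only to keep the example short.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set, Tuple

# Fields that should agree across records sharing one NVAL.
FIELDS = ("nlocn", "ndate", "nstat", "ntype", "gpid1", "gpid2")


def inconsistent_nvals(records: Iterable[Mapping]) -> Dict[str, Set[Tuple]]:
    """Return each NVAL that occurs with more than one distinct field combination."""
    seen: Dict[str, Set[Tuple]] = defaultdict(set)
    for rec in records:
        seen[rec["nval"]].add(tuple(rec[f] for f in FIELDS))
    return {nval: combos for nval, combos in seen.items() if len(combos) > 1}
```

Any NVAL returned here would be treated as potentially unreliable and shown to an expert user rather than matched automatically.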
4: Searching central for GPID2
• Search:
  • Calculate % match to donor’s sample
  • Sort by % match
  • Calculate reliability
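The slides do not say how the % match is calculated. One simple possibility, shown here purely as an assumption, is to treat each matching criterion as a pass/fail check with equal weight and report the share of checks passed; candidates are then sorted with the best match first. The function names and check labels are hypothetical.

```python
from typing import Dict, List, Tuple


def percent_match(checks: Dict[str, bool]) -> float:
    """Share of criteria satisfied, as a percentage (equal weights assumed)."""
    if not checks:
        return 0.0
    return 100.0 * sum(checks.values()) / len(checks)


def rank_candidates(candidates: Dict[int, Dict[str, bool]]) -> List[Tuple[int, float]]:
    """Return (GID, % match) pairs, best match first."""
    scored = [(gid, percent_match(checks)) for gid, checks in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A real scoring scheme would probably weight criteria unequally (for example, a matching preferred ID counting for more than a matching location), which is why the weights are kept out of the ranking step.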
5: Successful search for GPID2
• Assign
  • GPID2 := selected candidate
  • GPID1 := GPID1 of GPID2
• Display reliability and all recorded distinct values of NLOCN, NDATE, NSTAT, NTYPE, GPID1, GPID2 for same NVAL
• Expert user corrects wrong data for GPID2 & GPID1
• After correcting GPID2, new GID:
  • Inherits NLOCN and NDATE of GPID2
  • Is assigned NTYPE and NSTAT by user-rules
    • May be directly inherited e.g. NSTAT=1
    • May be changed: e.g. NSTAT=8 for GPID2, NSTAT=0 for new GID
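The assignment step and one example user rule can be sketched as follows. The `Germplasm` class is a hypothetical three-field stand-in for a GERMPLSM row, and `nstat_for_new_gid` encodes only the single example given on the slide (NSTAT=8 on the parent becomes 0 on the new GID, other values are inherited); actual rules are user-defined.

```python
from dataclasses import dataclass


@dataclass
class Germplasm:
    gid: int
    gpid1: int = 0
    gpid2: int = 0


def assign_parent(new: Germplasm, selected: Germplasm) -> None:
    """GPID2 := selected candidate; GPID1 := GPID1 of GPID2."""
    new.gpid2 = selected.gid
    new.gpid1 = selected.gpid1


def nstat_for_new_gid(parent_nstat: int) -> int:
    """Example user rule (assumed): 8 on the parent becomes 0, else inherit."""
    return 0 if parent_nstat == 8 else parent_nstat
```

Because the new GID copies GPID1 from the selected candidate, any error in the candidate's GPID1 is propagated, which is why the expert correction step comes before inheritance.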
6: Unsuccessful search for GPID2
• Repeat
  • Step 3, create GIDs with partial data to represent GPID2
  • Steps 4-6, look for and use, or create, the source of GPID2
• Iteration finishes with
  • Successful search for source of source, or
  • Source of source = GPID1
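The iteration above can be sketched as a loop over the chain of sources the provider has described, nearest source first, with the last entry standing for the original source (GPID1). Everything here is a hypothetical simplification: `search` stands in for a lookup in central that returns a GID or `None`, and `create` stands in for creating a GID with partial data.

```python
from typing import Callable, List, Optional, Tuple


def resolve_sources(
    chain: List[str],
    search: Callable[[str], Optional[int]],
    create: Callable[[str], int],
) -> Tuple[Optional[int], List[int]]:
    """Walk back through successive sources until one is found in central.

    Returns the GID found (or None if the chain ran out at the original
    source) and the GIDs created with partial data along the way.
    """
    created: List[int] = []
    for description in chain:
        found = search(description)
        if found is not None:
            # A successful search for a source ends the iteration.
            return found, created
        # Not in central: create a GID with partial data and move on to its source.
        created.append(create(description))
    return None, created
```

The loop stops either at the first successful search or when the chain is exhausted, mirroring the two stopping conditions on the slide.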
Intermediate cases
• Transfers of seed between users of the same ICIS central
  • For recipient, like handling incoming seed, but with parent GIDs already defined in provider’s list
• Seed increase of mixed accessions, splitting into uniform components as new accessions
  • Initially like seed increase, but then like receiving new accession