Cycle Stealing (kind of)
Christopher Salembier
CSC-8530
Review
• Cycle Stealing
  • Network of Workstations (NOW) as a parallel computer
  • Making use of unused resources
  • Workstation may be active or inactive
  • Used in many volunteer computing projects
    • Berkeley Open Infrastructure for Network Computing (BOINC)
    • Distributed.net
• GridGain middleware used
  • Java-based cloud-enabling platform
  • Configured through the Spring framework IoC container
(Intended) Project Focus
• The cycles / resources themselves
• Research findings
  • Fine-grained kernel-level stealing
  • Complex algorithms for scheduling and allocation
  • Existing platforms
• Did none of it
  • Needed C or a lower-level language
  • Wouldn’t have any distributed element
  • “Application”-level stealing instead
What Got Missed
• Client-side issues
  • GridGain only active within the JVM
  • No recovery on shutdown / offline
    • Cannot save task
    • Grid registers as a failure
• “Pull” jobs
• Bandwidth monitoring
  • WAN usage
The Distributed Task
• Parsing the contents of Wikipedia
  • 24 GB+ of text
  • Requires 3 passes
• Unused processing power on work servers
• Ultimate goal is a general text parsing system
  • General-purpose grid for processing large files
  • Use method injection to configure the algorithm
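Implementation - Worker

public abstract Serializable process() throws GridException;

// Template method: register with the steal factory, run the job, unregister.
public final Serializable execute() throws GridException {
    Serializable result = null;
    register();
    try {
        result = process();
    } finally {
        unregister();   // unregister even if process() throws
    }
    return result;
}

public final void register() {
    pauseLock = new ReentrantLock();
    waiting = pauseLock.newCondition();
    StealFactory.registerNewThread( this );
    // Pause immediately if the block thread is already active.
    if ( StealFactory.isBlockThreadActive() )
        pauseProcess();
}

public final void unregister() {
    StealFactory.unregisterThread( this );
}

// Blocks the calling thread until wakeProcess() is called.
public final void pauseProcess() {
    pauseLock.lock();   // await() requires the lock to be held
    try {
        if ( !isBlocked ) {
            isBlocked = true;
            while ( isBlocked )
                waiting.await();
        }
    } catch ( InterruptedException e ) {
        Thread.currentThread().interrupt();
    } finally {
        pauseLock.unlock();
    }
}

public final void wakeProcess() {
    pauseLock.lock();
    try {
        isBlocked = false;
        waiting.signal();
    } finally {
        pauseLock.unlock();
    }
}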
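The pause/wake handshake on this slide is built on a ReentrantLock and a Condition. Below is a minimal stand-alone sketch of that pattern, assuming a worker that checks a pause flag between units of work. PausableCounter and PausableDemo are hypothetical names for illustration; they are not the project's classes and do not use GridGain.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone worker (not the project's CycleSteal class):
// it checks a pause flag before every unit of work.
class PausableCounter implements Runnable {
    private final ReentrantLock pauseLock = new ReentrantLock();
    private final Condition resumed = pauseLock.newCondition();
    private final int limit;
    private boolean paused = false; // guarded by pauseLock
    private int count = 0;          // guarded by pauseLock

    PausableCounter(int limit) { this.limit = limit; }

    void pause() {
        pauseLock.lock();
        try { paused = true; } finally { pauseLock.unlock(); }
    }

    void resume() {
        pauseLock.lock();
        try {
            paused = false;
            resumed.signalAll();
        } finally { pauseLock.unlock(); }
    }

    int count() {
        pauseLock.lock();
        try { return count; } finally { pauseLock.unlock(); }
    }

    @Override
    public void run() {
        try {
            for (int i = 0; i < limit; i++) {
                pauseLock.lock();
                try {
                    while (paused) resumed.await(); // releases the lock while waiting
                    count++;                        // one "unit of work"
                } finally {
                    pauseLock.unlock();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // stop early, keep the interrupt flag
        }
    }
}

class PausableDemo {
    // Starts a worker that is already paused, samples its progress, then resumes it.
    // Returns { count while paused, count after completion }.
    static int[] runPausedThenResumed(int limit) {
        PausableCounter worker = new PausableCounter(limit);
        worker.pause();
        Thread t = new Thread(worker);
        t.start();
        try {
            Thread.sleep(50);
            int whilePaused = worker.count();
            worker.resume();
            t.join();
            return new int[] { whilePaused, worker.count() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = runPausedThenResumed(1000);
        System.out.println("while paused: " + r[0] + ", after resume: " + r[1]);
    }
}
```

Because the worker is paused before its thread starts, it cannot make progress until resume() is called. Checking the flag inside a while loop guards against spurious wakeups, the same reason the slide code loops on isBlocked.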
Implementation - Worker

// Block thread: woken whenever the monitor reports a new value.
public void setMonitoredValue( Long value ) {
    lock.lock();
    try {
        currentSystemState = value;
        if ( isWaiting ) {
            isWaiting = false;
            waiting.signal();
        }
    } finally {
        lock.unlock();
    }
}

public void run() {
    while ( running ) {
        checkSystemState();
        lock.lock();
        try {
            isWaiting = true;
            while ( isWaiting )
                waiting.await();
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
            return;
        } finally {
            lock.unlock();
        }
    }
}

// Only acts when the monitored value crosses the threshold.
private void checkSystemState() {
    if ( currentSystemState >= systemStateThreshold && isBlocking ) {
        isBlocking = false;
        StealFactory.notifyRegisteredThreads();
    } else if ( currentSystemState < systemStateThreshold && !isBlocking ) {
        isBlocking = true;
        StealFactory.blockRegisteredThreads();
    }
}

// Block manager: tracks registered jobs and pauses or wakes them as a group.
public void addTask( CycleSteal job ) {
    synchronized ( mutex ) {
        registeredJobs.add( job );
    }
}

public void removeTask( CycleSteal job ) {
    synchronized ( mutex ) {
        registeredJobs.remove( job );
    }
}

public void blockTasks() {
    synchronized ( mutex ) {
        for ( final CycleSteal job : registeredJobs ) {
            // pauseProcess() blocks its caller, so each call gets its own thread.
            new Thread( new Runnable() {
                public void run() {
                    job.pauseProcess();
                }
            } ).start();
        }
    }
}

public void notifyTasks() {
    synchronized ( mutex ) {
        for ( CycleSteal job : registeredJobs ) {
            job.wakeProcess();
        }
    }
}
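The threshold check in checkSystemState() only fires when the monitored value crosses the threshold, not on every update. A dependency-free sketch of that decision logic follows. BlockDecider and its Action enum are hypothetical names, and starting in the non-blocking state is an assumption, since the slides do not show the initial value of isBlocking.

```java
// Hypothetical model of the block/unblock decision; no threads, no GridGain.
class BlockDecider {
    enum Action { NONE, BLOCK, UNBLOCK }

    private final long threshold;
    private boolean blocking = false; // assumption: workers start unblocked

    BlockDecider(long threshold) { this.threshold = threshold; }

    // Mirrors the two branches of checkSystemState(): act only on a state change.
    Action update(long monitoredValue) {
        if (monitoredValue >= threshold && blocking) {
            blocking = false;
            return Action.UNBLOCK;
        }
        if (monitoredValue < threshold && !blocking) {
            blocking = true;
            return Action.BLOCK;
        }
        return Action.NONE;
    }

    boolean isBlocking() { return blocking; }
}

class BlockDeciderDemo {
    public static void main(String[] args) {
        BlockDecider decider = new BlockDecider(955); // 955 is the blockThread value in the config
        long[] samples = { 1000, 100, 200, 955 };
        for (long s : samples) {
            System.out.println(s + " -> " + decider.update(s));
        }
    }
}
```

Returning an action instead of calling the factory directly keeps the rule testable without starting any threads.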
Implementation - Master

// Called when a worker node finishes a task.
public void onFinished( GridTaskFuture<?> taskFuture ) {
    GridNode node = null;
    HashMap<String, String> map = null;
    UUID sesID = taskFuture.getTaskSession().getId();
    node = assignThread.removeFuture( sesID );
    map = (HashMap<String, String>)taskFuture.get();
    processJobResult( map );
    // Once the initial assignment pass is over, either hand the freed node
    // more work or shut down when nothing is left.
    if ( !assignThread.isRunning() ) {
        if ( !assignThread.hasWorkOutstanding() )
            shutdownMasterNode();
        else
            assignJobsToNode( node );
    }
}
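The master's callback follows a common pattern: record the result, then either give the freed node another job or shut down once nothing is left. The in-memory sketch below illustrates that flow under simplifying assumptions. MiniMaster is a hypothetical class, jobs and nodes are plain strings, and "work outstanding" is taken to mean jobs that are either still queued or still running, which the slides do not spell out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical in-memory master; stands in for the GridGain-based one.
class MiniMaster {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Map<String, String> inFlight = new HashMap<>(); // job -> node
    private final Map<String, String> results = new HashMap<>();  // job -> result
    private boolean shutDown = false;

    MiniMaster(List<String> jobs) { pending.addAll(jobs); }

    // Gives the next queued job to the node; returns null if the queue is empty.
    String assign(String node) {
        String job = pending.poll();
        if (job != null) inFlight.put(job, node);
        return job;
    }

    // Records a finished job and returns the next job for the same node, or null.
    String onFinished(String job, String result) {
        String node = inFlight.remove(job);
        if (node == null) throw new IllegalArgumentException("unknown job: " + job);
        results.put(job, result);
        if (pending.isEmpty() && inFlight.isEmpty()) {
            shutDown = true;   // nothing queued, nothing running
            return null;
        }
        return assign(node);   // null when other jobs are still running but none are queued
    }

    boolean isShutDown() { return shutDown; }
    int resultCount() { return results.size(); }
}

class MiniMasterDemo {
    public static void main(String[] args) {
        MiniMaster master = new MiniMaster(Arrays.asList("a", "b", "c"));
        System.out.println("n1 gets " + master.assign("n1"));
        System.out.println("n2 gets " + master.assign("n2"));
        System.out.println("n1 next: " + master.onFinished("a", "result-a"));
        System.out.println("n2 next: " + master.onFinished("b", "result-b"));
        System.out.println("n1 next: " + master.onFinished("c", "result-c"));
        System.out.println("shut down: " + master.isShutDown());
    }
}
```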
Configuration
• All components configured as Spring beans
  • Application Context XML file
  • Inversion of Control pattern
  • Allows run-time configuration
• Grid itself
• Node type
  • Master / Worker
  • User / CPU monitor
• Stealing properties
Configuration

<bean id="blockManager"
      class="edu.villanova.salembier.csc8530.steal.block.StealBlockManager"
      scope="singleton" />

<bean id="blockThread"
      class="edu.villanova.salembier.csc8530.steal.block.BlockThreadLong"
      scope="singleton">
    <constructor-arg type="long" value="955"/>
</bean>

<bean id="activityMonitor"
      class="edu.villanova.salembier.csc8530.steal.monitor.SystemActivityMonitor"
      scope="singleton">
    <constructor-arg type="long" value="500"/>
</bean>

<bean id="stealFactory"
      class="edu.villanova.salembier.csc8530.steal.StealFactoryLong"
      scope="singleton">
    <constructor-arg type="edu.villanova.salembier.csc8530.steal.block.BlockManager" ref="blockManager"/>
    <constructor-arg type="edu.villanova.salembier.csc8530.steal.block.BlockThread" ref="blockThread"/>
    <constructor-arg type="edu.villanova.salembier.csc8530.steal.monitor.ActivityMonitor" ref="activityMonitor"/>
</bean>

<bean id="grid.cfg"
      class="org.gridgain.grid.GridConfigurationAdapter"
      scope="singleton">
    <property name="gridName" value="CSC 8530 GRID"/>
    <property name="userAttributes">
        <map>
            <entry key="nodeType" value="cpu" />
        </map>
    </property>
    <property name="gridLogger">
        <bean class="org.gridgain.grid.logger.log4j.GridLog4jLogger">
            <constructor-arg type="java.lang.String" value="config/default-log4j.xml"/>
        </bean>
    </property>
    <property name="executorService">
        <bean class="edu.villanova.salembier.csc8530.execute.StealExecutorService">
            <constructor-arg type="int" value="10"/>
            <constructor-arg type="int" value="25"/>
        </bean>
    </property>
</bean>
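For readers less familiar with Spring, the constructor-arg wiring of the first four beans corresponds roughly to the hand-written construction below. This is only a sketch: the *Stub classes are simplified stand-ins invented for illustration, not the project's real classes, and the slides do not say what the values 955 and 500 mean, so the stubs simply store them.

```java
// Hypothetical stand-ins for the beans declared in the XML.
class BlockManagerStub { }

class BlockThreadStub {
    final long value;                       // 955 in the XML; meaning not stated in the slides
    BlockThreadStub(long value) { this.value = value; }
}

class ActivityMonitorStub {
    final long value;                       // 500 in the XML; meaning not stated in the slides
    ActivityMonitorStub(long value) { this.value = value; }
}

class StealFactoryStub {
    final BlockManagerStub blockManager;
    final BlockThreadStub blockThread;
    final ActivityMonitorStub activityMonitor;

    // Dependencies are handed in from outside (inversion of control),
    // rather than created inside the factory.
    StealFactoryStub(BlockManagerStub blockManager,
                     BlockThreadStub blockThread,
                     ActivityMonitorStub activityMonitor) {
        this.blockManager = blockManager;
        this.blockThread = blockThread;
        this.activityMonitor = activityMonitor;
    }
}

class ManualWiring {
    // Builds one instance of each component, as singleton scope would.
    static StealFactoryStub wire() {
        BlockManagerStub blockManager = new BlockManagerStub();
        BlockThreadStub blockThread = new BlockThreadStub(955L);
        ActivityMonitorStub activityMonitor = new ActivityMonitorStub(500L);
        return new StealFactoryStub(blockManager, blockThread, activityMonitor);
    }

    public static void main(String[] args) {
        StealFactoryStub factory = wire();
        System.out.println("blockThread value: " + factory.blockThread.value);
        System.out.println("activityMonitor value: " + factory.activityMonitor.value);
    }
}
```

Keeping this wiring in XML rather than in code is what lets the values and implementation classes change without recompiling.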
Testing
• No UI to verify
• Reliant on log files
  • Massive amount of data generated
  • Log entries not always sequential
• 1st round of testing failed
  • Though code was working
• Block Thread & Manager needed significant rework
• Though difficult, log ultimately helped solve problem
Demo
• I hope this works!!!
• Run nodes of each type
• View logs
• View output
Conclusion
• Original goal not really met
• Project a success
• Learned a lot about…
  • Cycle stealing
  • Complex multi-threaded Java systems
  • General distributed programming issues