
Add the option to run a slave daemon on the master node. #7

Closed
wants to merge 5 commits

Conversation

@cmd-ntrf (Contributor) commented Jan 6, 2015

While testing some scenarios with Spark on our cluster, I realized that most of the time the master node was idle.

I propose to add a parameter, for now called MAGPIE_MASTER_IS_SLAVE, that when set to "yes" adds the master to the list of slaves. By default, this option is off.

The parameter could be renamed to something more meaningful; this is just what first came to mind. I assumed this behaviour is not specific to Spark, hence the choice of prefix (MAGPIE).
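Roughly, the idea is something like the following sketch (the variable holding the master hostname and the exact slave-list file used by magpie-setup-core are assumptions here, not the actual names in the patch):

    # Sketch only: MAGPIE_MASTER_HOSTNAME is a placeholder, and the slave
    # list is assumed to be built in ${MAGPIE_LOCAL_DIR}/tmp_slaves.
    if [ "${MAGPIE_MASTER_IS_SLAVE}" == "yes" ]
    then
        # Also list the master node as a slave so the slave daemons
        # (DataNode, RegionServer, Spark Worker, ...) are started on it.
        echo "${MAGPIE_MASTER_HOSTNAME}" >> ${MAGPIE_LOCAL_DIR}/tmp_slaves
    fi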

@chu11 (Member) commented Jan 6, 2015

This seems like a good idea. Question though: have you only tried this with Spark? I'm wondering how it would work with other stuff. The Hadoop HDFS setup is the one I'm wondering about, although I believe it would work.

Instead of "MAGPIE_MASTER_IS_SLAVE", how about "MAGPIE_RUN_SLAVE_ON_MASTER", which (at least to me) is clearer about what the option is doing?

I would also like to see a clearer description in the comments. How about something like:

"By default, the master node will only run master daemons, such as the Hadoop NameNode, Hbase Master, Spark Master, etc.

On certain workloads the amount of memory and processing used on the master may be low enough that it is worthwhile to also run slave node daemons on the master (e.g. Hadoop DataNode, Hbase RegionServer, Spark Worker, etc.). By setting the below to yes, you will inform Magpie to also run slave node daemons on the master. If enabled, you are advised to adjust the heaps and memory usage of all necessary daemons so that resources are allocated appropriately and performance stays decent.

By default, this is turned off."

I dunno. That's a start. Feel free to adjust as needed.

Also, could the option be placed before MAGPIE_REMOTE_CMD/MAGPIE_REMOTE_CMD_OPTS? I'd like to keep those as the last options within that section, for example along these lines (a sketch only; the exact defaults and surrounding layout are assumptions):
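    ############################################################################
    # By default, the master node will only run master daemons, such as the
    # Hadoop NameNode, Hbase Master, Spark Master, etc.  Set this to "yes"
    # to also run slave node daemons (DataNode, RegionServer, Spark Worker,
    # ...) on the master.  Off by default.
    ############################################################################
    # export MAGPIE_RUN_SLAVE_ON_MASTER="yes"

    # ... MAGPIE_REMOTE_CMD / MAGPIE_REMOTE_CMD_OPTS kept last in the section
    # export MAGPIE_REMOTE_CMD="ssh"
    # export MAGPIE_REMOTE_CMD_OPTS=""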

Thanks

@cmd-ntrf (Contributor, Author) commented Jan 6, 2015

I have updated the variable name and the documentation based on your suggestion.

I have not yet tried the parameter's effect on anything other than Spark. I wanted to submit the idea with a working patch to test the waters and see if there was anything I had missed before doing further testing.

Do you have suggestions other than simply running the different plugin test examples to make sure everything works properly with MAGPIE_RUN_SLAVE_ON_MASTER=yes? I recommend we leave this request unmerged until I get the results. I will keep you updated.

@chu11 (Member) commented Jan 6, 2015

I think all the plugin tests are a good start. At least they're a good sanity check that things work. If Hadoop terasort works with hdfsoverlustre (or any network file system), that gives high confidence things are good.

I also realized we may need to tweak the HDFS federation code in magpie-setup-core just below your changes.

I think all we'd have to do is put an if statement around

tail -n+${HDFS_FEDERATION_NAMENODE_COUNT} ${MAGPIE_LOCAL_DIR}/tmp_slaves > ${MAGPIE_LOCAL_DIR}/tmp_slaves_tmp

and not do the above if MAGPIE_RUN_SLAVE_ON_MASTER = yes.
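Something along these lines (a sketch only; the surrounding federation code and how tmp_slaves_tmp is consumed afterwards are not shown here):

    # Guard the federation adjustment so it is skipped when the master
    # also runs slave daemons.
    if [ "${MAGPIE_RUN_SLAVE_ON_MASTER}" != "yes" ]
    then
        tail -n+${HDFS_FEDERATION_NAMENODE_COUNT} ${MAGPIE_LOCAL_DIR}/tmp_slaves > ${MAGPIE_LOCAL_DIR}/tmp_slaves_tmp
    fi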

@cmd-ntrf (Contributor, Author) commented Jan 8, 2015

Hadoop terasort with hdfsoverlustre is working with MAGPIE_RUN_SLAVE_ON_MASTER="yes". I have not yet tested the other plugins, aside from Spark.

@chu11 force-pushed the master branch 3 times, most recently from 558455c to 06da05d on June 2, 2015 at 18:34
@cmd-ntrf closed this May 18, 2020